M201 14 THE OPEN UNIVERSITY
Mathematics: A Second Level Course 


Linear Mathematics Unit 14 


Bilinear and Quadratic Forms 




Prepared by the Linear Mathematics Course Team 




The Open University Press Walton Hall MK7 6AA 


First published 1972. Reprinted 1976 
Copyright © 1972 The Open University 


All rights reserved. No part of this work may 
be reproduced in any form, by mimeograph 
or any other means, without permission in 
writing from the publishers. 


Designed by the Media Development Group of the Open University. 
Printed in Great Britain by 
Martin Cadbury 


SBN 335 01105 5 


This text forms part of the correspondence element of an Open University 
Second Level Course. The complete list of units in the course is given at 
the end of this text. 


For general availability of supporting material referred to in this text, 
please write to the Director of Marketing, The Open University, P.O. Box 
81, Walton Hall, Milton Keynes, MK7 6AT. 


Further information on Open University courses may be obtained from 


the Admissions Office, The Open University, P.O. Box 48, Walton Hall, 
Milton Keynes, MK7 6AB. 


Contents

         Set Books
         Conventions
14.0     Introduction
14.1     Bilinear Forms
14.1.1   Definition
14.1.2   Matrix Representation and Change of Basis
14.1.3   Symmetric and Skew-symmetric Forms
14.1.4   Summary of Section 14.1
14.2     Quadratic Forms
14.2.1   Definition
14.2.2   The Polar Form of a Quadratic Form
14.2.3   The Quadratic Taylor Approximation
14.2.4   Summary of Section 14.2
14.3     The Normal Form
14.3.0   Introduction
14.3.1   Getting the Matrix Diagonal
14.3.2   Getting a Diagonal Matrix into Normal Form
14.3.3   Uniqueness of the Normal Form
14.3.4   An Application of Real Quadratic Forms
14.3.5   Summary of Section 14.3
14.4     Summary of the Unit
14.5     Self-Assessment


Set Books 


D. L. Kreider, R. G. Kuller, D. R. Ostberg and F. W. Perkins, An
Introduction to Linear Analysis (Addison-Wesley, 1966).


E. D. Nering, Linear Algebra and Matrix Theory (John Wiley, 1970). 


It is essential to have these books; the course is based on them and will 
not make sense without them. 


Conventions 


Before working through this correspondence text make sure you have 
read A Guide to the Linear Mathematics Course. Of the typographical 
conventions given in the Guide the following are the most important. 


The set books are referred to as: 
K for An Introduction to Linear Analysis 
N for Linear Algebra and Matrix Theory 
All starred items in the summaries are examinable. 


References to the Open University Mathematics Foundation Course 
Units (The Open University Press, 1971) take the form Unit M100 3, 
Operations and Morphisms. 


14.0 INTRODUCTION 


This unit branches out from the single-minded pursuit of linear
transformations which has occupied us in most of the other vector space
units in the course so far. We shall be looking at some non-linear
functions on vector spaces. An example of such a function (in the space
$R^2$, represented by a plane) is the function that maps the number pair
$(x_1, x_2)$, represented by a point in the plane, to the square of the
distance of that point from the origin; in symbols, this function is

$$(x_1, x_2) \longmapsto x_1^2 + x_2^2 \qquad ((x_1, x_2) \in R^2).$$


This function is not a linear transformation; it belongs to another class 
of functions on vector spaces, known as quadratic forms, which we shall 
be studying later in the unit. Quadratic forms are useful in geometry not 
only because of their relationship to distance as described above, but also 
because, for instance, all the curves known as conic sections (e.g. ellipses, 
parabolas) can be expressed in terms of quadratic forms. 


Quadratic forms also arise in various branches of applied mathematics.
For example, in electric circuit theory, the heat produced in a resistor is
given by $Ri^2$, where R is its resistance and i is the current in it; the
function $i \longmapsto Ri^2$ is an example of a quadratic form. The heat
produced in a more complicated circuit is given by a more complicated
quadratic form; for example, in the network shown, the heat produced is

$$R_1 i_1^2 + R_2 i_2^2 + R_3 i_3^2.$$

[Figure: a network containing the resistors $R_1$, $R_2$, $R_3$ and a
source $E$, with the currents marked.]


Since Kirchhoff's First Law (current flowing into a node equals current
flowing out of the node) gives $i_3 = i_1 + i_2$, the heat produced can
also be written

$$R_1 i_1^2 + R_2 i_2^2 + R_3 (i_1 + i_2)^2.$$

This depends on $i_1$ and $i_2$ through the function

$$(i_1, i_2) \longmapsto R_1 i_1^2 + R_2 i_2^2 + R_3 (i_1 + i_2)^2 \qquad ((i_1, i_2) \in R^2),$$

which is another example of a quadratic form.


In order to relate the study of quadratic forms to linear mathematics,
we approach it through the study of another type of function called a
bilinear form.* The bilinear form associated with the quadratic form
$(x_1, x_2) \longmapsto x_1^2 + x_2^2$ mentioned above is

$$((x_1, x_2), (y_1, y_2)) \longmapsto x_1 y_1 + x_2 y_2 \qquad (((x_1, x_2), (y_1, y_2)) \in R^2 \times R^2).$$


A bilinear form is a more complicated concept than a quadratic form, 
because its domain is not a single vector space but the Cartesian product 
of two vector spaces. In compensation, however, it has the advantage of 
being linear when we consider either of the vector spaces separately; this 
makes it possible to use what we already know about linear transforma- 
tions to help with the study of some non-linear transformations, namely 
the quadratic forms. In Section 3 we obtain normal forms for matrices 
representing bilinear forms, and use these to analyse the stationary points 
of suitable differentiable real-valued functions of two variables. 


* "Bilinear functional" might be a better name, because of the analogy with linear functionals.


14.1 BILINEAR FORMS 
14.1.1 Definition


How much does it cost to run The Open University? Included in the
total cost will be an item that depends both on the number of students
taking each course and on the materials for that course: suppose that
$x_1$ students take course number 1, $x_2$ take course number 2, and so
on, and that $y_1$ is the cost of providing one correspondence package of
any of the courses, $y_2$ the cost per student of one tutor-marked
assignment, and $y_3$ the cost of one computer-marked assignment; then
the item we are considering will be

$$\begin{aligned}
\text{Cost} = {} & x_1(a_{11} y_1 + a_{12} y_2 + a_{13} y_3) \\
 & + x_2(a_{21} y_1 + a_{22} y_2 + a_{23} y_3) \\
 & + \cdots \\
 & + x_n(a_{n1} y_1 + a_{n2} y_2 + a_{n3} y_3),
\end{aligned}$$

where n is the number of courses, $a_{r1}$ is the number of
correspondence packages per student in the rth course, $a_{r2}$ the
number of tutor-marked assignments and $a_{r3}$ the number of
computer-marked assignments in the rth course. The formula for cost
defines a function with domain $R^n \times R^3$, since the "student
numbers vector" $(x_1, \ldots, x_n)$ lies in $R^n$ and the "cost per
teaching item" vector $(y_1, y_2, y_3)$ lies in $R^3$. The reason for
calling this function a bilinear form is that it is linear in each of the
two vectors: fix $(x_1, \ldots, x_n)$ and you get

$$(y_1, y_2, y_3) \longmapsto \text{Cost},$$

which is a linear functional; similarly, fix $(y_1, y_2, y_3)$ and you get

$$(x_1, \ldots, x_n) \longmapsto \text{Cost},$$

which is also a linear functional.
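
As a rough illustration, the cost formula can be evaluated, and its linearity in the student-numbers vector checked, in a few lines of Python. The figures below are invented for the purpose and are not taken from the course; the names `cost`, `a`, `x`, `x_other` and `y` are our own.

```python
def cost(x, a, y):
    """Evaluate the sum over r and j of x[r] * a[r][j] * y[j]."""
    return sum(x[r] * a[r][j] * y[j]
               for r in range(len(x))
               for j in range(len(y)))

# Hypothetical data: 2 courses, 3 kinds of teaching item.
a = [[4, 8, 6],      # course 1: packages, TMAs, CMAs per student
     [3, 5, 10]]     # course 2
x = [100, 250]       # students on each course
x_other = [40, 10]   # a second student-numbers vector
y = [2.0, 1.5, 0.25] # cost of one item of each kind

total = cost(x, a, y)

# Linearity in the first vector, with y held fixed:
# cost(3x + x_other, y) should equal 3 cost(x, y) + cost(x_other, y).
combined = [3 * u + v for u, v in zip(x, x_other)]
lhs = cost(combined, a, y)
rhs = 3 * cost(x, a, y) + cost(x_other, a, y)
print(total, lhs, rhs)
```

The same check with the roles of the two vectors exchanged illustrates linearity in the cost vector.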


READ from the beginning of Section IV-8 on page N156 to the end of the 
second example on that page. 


Notes 


(i) Definition The domain of f is $U \times V$, and the codomain is F. The
definition of bilinearity (Equation (8.1)) can be paraphrased: for each
fixed $\beta$, the function from U to F

$$\alpha \longmapsto f(\alpha, \beta) \qquad (\alpha \in U)$$

is a linear functional, and for each fixed $\alpha$ the function from V to F

$$\beta \longmapsto f(\alpha, \beta) \qquad (\beta \in V)$$

is a linear functional.
(ii) Example (1) This is a generalization of the dot product that you met in
Unit M100 22, sub-section 1.6, in that $\{\alpha_1, \ldots, \alpha_n\}$ can
be any basis, whereas the dot product which you met in the Foundation
Course is expressed in terms of a basis consisting of three mutually
perpendicular vectors of unit length. Many physical and mechanical laws
are expressed in terms of this "geometric" dot product.
(iii) Example (2) Note that if we fixed $\alpha$ we would get the mapping

$$L_\alpha : \beta \longmapsto \int \alpha(x) \beta(x) \, dx$$

which we discussed in Unit 12, Linear Functionals and Duality, sub-section 12.3.2.


Example 


This example further illustrates the link between linear functionals and
bilinear forms. Let V be a vector space over a field F, and consider the
function f with domain $V \times \hat{V}$ and codomain F, defined by

$$f : (\alpha, \phi) \longmapsto \phi(\alpha) \qquad (\alpha \in V, \ \phi \in \hat{V}).$$

(Remember that $\hat{V}$ is the dual space to V.)

Fixing an element $\phi$ of $\hat{V}$ results in the linear functional

$$\phi : \alpha \longmapsto \phi(\alpha) \qquad (\alpha \in V)$$

on V, while fixing an element $\alpha$ of V results in the linear functional

$$\phi \longmapsto \phi(\alpha) \qquad (\phi \in \hat{V})$$

on $\hat{V}$ which you met in sub-section 12.2.2 of Unit 12.
Thus f satisfies the definition of a bilinear form.


Example 
If $U = V = R^3$, then the function

$$f : ((x_1, x_2, x_3), (y_1, y_2, y_3)) \longmapsto y_1 (x_1 + x_2 + x_3) \qquad ((x_1, x_2, x_3), (y_1, y_2, y_3) \in R^3)$$

is a bilinear form, since by fixing $y_1$ we obtain a linear functional of
$(x_1, x_2, x_3)$, while by fixing $(x_1, x_2, x_3)$ we obtain a linear
functional of $(y_1, y_2, y_3)$.


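On the other hand, the function

$$g : ((x_1, x_2, x_3), (y_1, y_2, y_3)) \longmapsto x_1 x_2 + y_2 \qquad ((x_1, x_2, x_3), (y_1, y_2, y_3) \in R^3)$$

is not a bilinear form, since by fixing $(y_1, y_2, y_3)$, we get the function

$$(x_1, x_2, x_3) \longmapsto x_1 x_2 + \text{constant} \qquad ((x_1, x_2, x_3) \in R^3)$$

which is not a linear functional, because, for instance, if $a \in R$,

$$a(x_1, x_2, x_3) \longmapsto (a x_1)(a x_2) + \text{constant} = a^2 x_1 x_2 + \text{constant},$$

and in general $a^2 x_1 x_2 \neq a x_1 x_2$.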


Exercise 


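Which of the following functions f with domain $U \times V$, codomain R,
are bilinear forms?

(a) $f(a_1 \alpha_1 + a_2 \alpha_2, \ b_1 \beta_1 + b_2 \beta_2) = a_1 + a_2 b_1 b_2 \quad (a_1, a_2, b_1, b_2 \in R)$,
where $\{\alpha_1, \alpha_2\}$ is a basis for U and $\{\beta_1, \beta_2\}$ is a basis for V.

(b) $f(a_1 \alpha_1 + a_2 \alpha_2, \ b_1 \beta_1 + b_2 \beta_2) = a_1 b_2 + a_2 b_1$
under the same conditions as (a).

(c) $f(a_1 \alpha_1 + a_2 \alpha_2, \ b_1 \beta_1 + b_2 \beta_2) = a_1 b_1^2$
under the same conditions as (a).

(d) $f(\alpha, \beta) = 0 \quad (\alpha \in U, \ \beta \in V)$.

(e) $f(g, h) = g'(\tfrac{1}{2}) \, h'(\tfrac{1}{2}) \quad (g \in U, \ h \in V)$ with $U = V = C^1(0, 1)$.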


Solution 


The functions (b), (d) and (e) are bilinear forms. Functions
(a) and (c) are not bilinear forms; this can be seen by using the
argument at the end of the previous example.


In general, when testing a function from $U \times V$ into F for
bilinearity, it is easier to fix each variable in turn and see if the
result is a linear functional in the other variable, rather than to use
Equation (8.1). For instance, function (e) in the exercise above is
bilinear because, if we fix g, then f(g, h) is proportional to
$h'(\tfrac{1}{2})$; if we fix h, then f(g, h) is proportional to
$g'(\tfrac{1}{2})$; and we know that these differentiation mappings are
linear.
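
The "fix each variable in turn" test can also be tried numerically. The sketch below is only a spot-check on randomly chosen vectors, so a pass is evidence rather than proof; the helper names, the tolerance, and the second (non-bilinear) function are our own illustrative choices, with vectors represented by their coordinates.

```python
import random

def is_linear_functional(phi, dim, trials=50, tol=1e-9):
    """Spot-check phi(a*u + v) == a*phi(u) + phi(v) on random vectors."""
    rng = random.Random(0)
    for _ in range(trials):
        u = [rng.uniform(-5, 5) for _ in range(dim)]
        v = [rng.uniform(-5, 5) for _ in range(dim)]
        a = rng.uniform(-5, 5)
        w = [a * ui + vi for ui, vi in zip(u, v)]
        if abs(phi(w) - (a * phi(u) + phi(v))) > tol:
            return False
    return True

def looks_bilinear(f, dim_u, dim_v):
    """Fix each argument in turn and test linearity in the other one."""
    rng = random.Random(1)
    fixed_u = [rng.uniform(-5, 5) for _ in range(dim_u)]
    fixed_v = [rng.uniform(-5, 5) for _ in range(dim_v)]
    return (is_linear_functional(lambda u: f(u, fixed_v), dim_u)
            and is_linear_functional(lambda v: f(fixed_u, v), dim_v))

def form_b(a, b):
    # Function (b) of the exercise, in coordinates: a1*b2 + a2*b1.
    return a[0] * b[1] + a[1] * b[0]

def form_squares(a, b):
    # A hypothetical non-bilinear function: (a1)^2 * (b1)^2.
    return (a[0] ** 2) * (b[0] ** 2)

result_b = looks_bilinear(form_b, 2, 2)
result_squares = looks_bilinear(form_squares, 2, 2)
print(result_b, result_squares)
```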


14.1.2 Matrix Representation and Change of Basis 


In our first example of a bilinear form (the one relating to the cost of The
Open University), we displayed the formula in a rectangular array:

$$\begin{aligned}
\text{Cost} = {} & x_1 a_{11} y_1 + x_1 a_{12} y_2 + x_1 a_{13} y_3 \\
 & + x_2 a_{21} y_1 + x_2 a_{22} y_2 + x_2 a_{23} y_3 \\
 & + \cdots \\
 & + x_n a_{n1} y_1 + x_n a_{n2} y_2 + x_n a_{n3} y_3.
\end{aligned} \qquad (1)$$

This suggests the use of matrices, and indeed the formula can be written

$$\text{Cost} = X^T A Y,$$

where $X^T = [x_1 \ x_2 \ \cdots \ x_n]$ (since a one-row matrix is the
transpose of a one-column matrix),

$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ \vdots & \vdots & \vdots \\ a_{n1} & a_{n2} & a_{n3} \end{bmatrix}
\quad \text{and} \quad
Y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}.$$

The matrix A, which has just the same entries as the array of coefficients
in Equation (1), gives us a matrix representation of the bilinear form

$$((x_1, \ldots, x_n), (y_1, y_2, y_3)) \longmapsto \text{Cost}.$$

This use of matrices should not be confused with the use of matrices to
represent linear transformations. A matrix is a device for storing
information; if the information falls naturally into a rectangular array,
then it can be stored as a matrix. But there are several different sorts
of information that can be stored this way, and whenever a matrix is used,
one must know what it is being used for. This is especially true if there
is any question of changing bases in the underlying vector space(s), since
the effect of a change of basis on the entries in the matrix will depend
on the use to which the matrix is being put.


If U and V are m- and n-dimensional spaces respectively, then we know
that a linear transformation from V to U can be represented by an
$m \times n$ matrix. Presently we shall see that a bilinear form on
$U \times V$ can also be represented by an $m \times n$ matrix, and that
if P is a matrix of transition in U, Q a matrix of transition in V, then
the formula for going from a matrix B representing the bilinear form in
the old bases, to B' representing it in the new bases, is

$$B' = P^T B Q.$$

Compare this with the formula relating two matrices (call them B and B'
again) which represent a linear transformation from V to U with respect
to the old and new bases (see sub-section 3.1.4 of Unit 3, Hermite Normal
Form):

$$B' = P^{-1} B Q.$$

READ from line -5 of page N156 to line 15 of page N158.


Note 


line -7, page N157. The formula for change of basis can be derived rather
more easily than via the work leading to Equations (8.4) and (8.5). We saw in
Unit 3 (page N50), that if X is a one-column matrix representing a vector in the
vector space U with respect to the old basis, and P is the matrix of transition to
a new basis, then the one-column matrix X' representing the vector with respect
to the new basis, is related to X by

$$X = P X'.$$

Similarly, the one-column matrices Y and Y' representing a vector in the space
V are related by $Y = Q Y'$, where Q is the matrix of transition in V.

Then, by Equation (8.3) on page N157, we have

$$\begin{aligned}
f(\alpha, \beta) &= X^T B Y \\
 &= (P X')^T B (Q Y') \\
 &= (X')^T (P^T B Q) Y'.
\end{aligned}$$

Equation (8.3) defines the matrix representing the bilinear form in any bases,
so it follows that $P^T B Q$ is the new matrix representing f.

Example of Matrix Representation 

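The determinant function of $R^2$ is a bilinear form.* As we have seen
earlier (Unit 5, Determinants and Eigenvalues, page K681), it is specified by

$$\det((x_1, x_2), (y_1, y_2)) = \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} = x_1 y_2 - x_2 y_1$$

and since this is equal to

$$\begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix},$$

the matrix representing this bilinear form is

$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

with respect to the standard basis in $R^2$.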


Example of Change of Bases 


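Let $U = V = R^3$, and let f be the "geometric" inner product function
specified by

$$f(\xi, \eta) = x_1 y_1 + x_2 y_2 + x_3 y_3,$$

where $\xi = (x_1, x_2, x_3)$ and $\eta = (y_1, y_2, y_3)$. Then the matrix
of f with respect to the standard bases in U and V is

$$B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},$$

because $f(\xi, \eta) = [x_1 \ x_2 \ x_3] \, B \, [y_1 \ y_2 \ y_3]^T$.
Now let

$$\{(1, 0, 0), (1, 1, 0), (1, 1, 1)\} = \{\alpha_1, \alpha_2, \alpha_3\}$$

be a new basis for U and V. The matrix of transition (whose columns
represent the new basis vectors with respect to the old ones) is

$$P = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$$

and the matrix of f with respect to $\{\alpha_1, \alpha_2, \alpha_3\}$ is

$$B' = P^T B P =
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}
= \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 2 \\ 1 & 2 & 3 \end{bmatrix}.$$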
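
The change-of-basis calculation in this example can be checked mechanically. The following sketch, which assumes nothing beyond the matrices B and P of the example, computes the product of the transpose of P, B and P with plain Python lists and compares it with the values of the inner product on pairs of new basis vectors; the helper names are our own.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    """Return the matrix product a * b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

B = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # matrix of f in the standard basis
P = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]   # columns are the new basis vectors

B_new = matmul(matmul(transpose(P), B), P)

# Entry (i, j) of the new matrix should be f(alpha_i, alpha_j).
basis = transpose(P)                     # rows are alpha_1, alpha_2, alpha_3
direct = [[sum(u_k * v_k for u_k, v_k in zip(u, v)) for v in basis]
          for u in basis]
print(B_new)
print(direct)
```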
* The determinant function on $R^n$, for n > 2, is not bilinear. It is, in
fact, "n-linear", or "multilinear".




Exercises 


1. Exercise 1, page N159. (First see line 11, page N157.)


2. Prove that congruence is an equivalence relation on matrices (see 
page N74). 


Hint: recall that 


(a) The identity matrix is non-singular. 
(b) The inverse of a non-singular matrix is non-singular. 
(c) The product of two non-singular matrices is non-singular. 


Solutions 


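1. Since

$$x_1 y_1 + 2 x_1 y_2 - x_2 y_1 - x_2 y_2 + 6 x_1 y_3 =
\begin{bmatrix} x_1 & x_2 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 6 \\ -1 & -1 & 0 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix},$$

the matrix is

$$\begin{bmatrix} 1 & 2 & 6 \\ -1 & -1 & 0 \end{bmatrix}.$$

2. Let "$B' \sim B$" mean "B' is congruent to B".

We must prove three things:
(i) $B \sim B$.
(ii) If $B' \sim B$, then $B \sim B'$.
(iii) If $B' \sim B$ and $B'' \sim B'$, then $B'' \sim B$.

We prove (i) by noting that $B = I^T B I$, and I is non-singular.

We prove (ii) by noting that if $B' \sim B$, then $B' = P^T B P$,
where P is non-singular. Thus

$$\begin{aligned}
(P^{-1})^T B' P^{-1} &= (P^{-1})^T P^T B P P^{-1} \\
 &= (P^T)^{-1} P^T B P P^{-1} \\
 &= I B I \\
 &= B
\end{aligned}$$

and $P^{-1}$ is non-singular. Thus $B \sim B'$.

We prove (iii) by noting that, if $B' = P^T B P$ and $B'' = Q^T B' Q$,
where P, Q are non-singular, then

$$\begin{aligned}
B'' &= Q^T (P^T B P) Q \\
 &= (PQ)^T B (PQ)
\end{aligned}$$

where PQ is non-singular.
Thus $B'' \sim B$.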




14.1.3 Symmetric and Skew-symmetric Forms 


The next idea we consider is that a bilinear form can be symmetric. What
does this mean? We saw in Unit M100 30, Groups I, that objects can have
various sorts of symmetry, depending on what sort of transformations
leave them invariant. For example, the shape in the left-hand figure is
symmetric with respect to a reflection about the vertical axis shown, and
the shape in the right-hand figure is symmetric with respect to a rotation
of 180° about the point shown.

[Figures: left, a shape with a vertical axis of reflection; right, a shape
with a centre of 180° rotation.]


The graph of the function $f : x \longmapsto x^2$ is symmetric with
respect to reflection in the y-axis.

[Figure: graph of $y = x^2$.]

In this case the symmetry can be expressed algebraically by the equation:

$$f(x) = f(-x) \qquad (x \in R).$$


Functions with this property are said to be even. For example, the cosine
function is even. This sort of symmetry is interesting from our point of
view, in that we can form a complementary concept, that of anti-symmetry,
expressed by the equation

$$g(x) = -g(-x) \qquad (x \in R).$$

Functions with this property are said to be odd. For example, the sine
function and the function $g : x \longmapsto x^3$ are anti-symmetric or
odd. In fact, any polynomial function all of whose non-zero terms are in
even powers of x, is symmetric, and any polynomial function all of whose
non-zero terms are in odd powers of x, is anti-symmetric.

[Figure: the graph $\{(x, y) : y = x + x^3\}$.]


What do we know about an arbitrary polynomial function of x? Precisely
that it is the sum of its odd terms and its even terms. For instance,

$$h : x \longmapsto 1 - x + 2x^2 - 4x^5 \qquad (x \in R)$$

is equal to f + g, where

$$f : x \longmapsto 1 + 2x^2$$

is even and

$$g : x \longmapsto -x - 4x^5$$

is odd. This is not at all difficult to see: what is interesting is that
any function $h : R \longrightarrow R$ can be expressed as the sum of a
symmetric part, f, and an anti-symmetric part, g. For, any combination

$$x \longmapsto a h(x) + a h(-x) \qquad (a \in R)$$

is symmetric, and any combination

$$x \longmapsto b h(x) - b h(-x) \qquad (b \in R)$$

is anti-symmetric. The expressions for f and g are found by choosing a and
b so that f + g = h, i.e.

$$f(x) = \tfrac{1}{2}(h(x) + h(-x)) \qquad (x \in R),$$

$$g(x) = \tfrac{1}{2}(h(x) - h(-x)) \qquad (x \in R).$$


Example 
The function $x \longmapsto e^x$ can be expressed as the sum of the
symmetric part

$$\cosh : x \longmapsto \tfrac{1}{2}(e^x + e^{-x})$$

and the anti-symmetric part

$$\sinh : x \longmapsto \tfrac{1}{2}(e^x - e^{-x}).$$

The way of splitting h into symmetric and anti-symmetric parts is unique;
however it is done, the same f and g are obtained.
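
The splitting into symmetric and anti-symmetric parts is easily tried out numerically. In the sketch below the function names `even_part` and `odd_part` are our own; the code forms the two parts of the exponential function and compares them with the library values of cosh and sinh at an arbitrarily chosen point.

```python
import math

def even_part(h):
    """Return the symmetric part x -> (h(x) + h(-x)) / 2."""
    return lambda x: 0.5 * (h(x) + h(-x))

def odd_part(h):
    """Return the anti-symmetric part x -> (h(x) - h(-x)) / 2."""
    return lambda x: 0.5 * (h(x) - h(-x))

f = even_part(math.exp)
g = odd_part(math.exp)

x = 0.7   # an arbitrary test point
print(f(x), math.cosh(x))
print(g(x), math.sinh(x))
print(f(x) + g(x), math.exp(x))
```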


Exercise 


Prove that if h = k + l, where k is symmetric and l is anti-symmetric, then
k = f and l = g. (The functions f, g, h are as above.)


Solution

$$h(x) = k(x) + l(x) \qquad (x \in R) \qquad (1)$$

so that

$$h(-x) = k(-x) + l(-x) = k(x) - l(x) \qquad (x \in R) \qquad (2)$$

because k is symmetric and l is anti-symmetric. Solving Equations
(1) and (2) for k and l, we find that k = f and l = g.




We now consider symmetry for bilinear forms. Since

$$f(\alpha, \beta) = f(-\alpha, -\beta)$$

is a tautology (f is linear in $\alpha$ and $\beta$ separately), the
obvious extension of the definition of symmetry for functions of one
variable gets us nowhere. The definitions we consider are not directly
analogous to those discussed above for functions of one variable, but the
results are very similar. The definitions involve interchanging the two
vectors in the ordered pair $(\alpha, \beta)$.
Definitions 


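A bilinear form $f : U \times U \longrightarrow F$ is symmetric if
$f(\alpha, \beta) = f(\beta, \alpha)$ for all $\alpha, \beta \in U$.


A bilinear form $g : U \times U \longrightarrow F$ is anti-symmetric if
$g(\alpha, \beta) = -g(\beta, \alpha)$ for all $\alpha, \beta \in U$.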


Examples 
The determinant function on $R^2$

$$((x_1, x_2), (y_1, y_2)) \longmapsto x_1 y_2 - x_2 y_1$$

is anti-symmetric.
The function

$$((x_1, x_2), (y_1, y_2)) \longmapsto x_1 y_2 + x_2 y_1$$

with domain $R^2 \times R^2$, however, is symmetric.


Exercise 


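Find expressions for the symmetric and anti-symmetric parts of a general
bilinear form $h : U \times U \longrightarrow F$.


(Hint: the method is similar to that for real functions.)


Solution
The symmetric part is

$$f : (\alpha, \beta) \longmapsto \tfrac{1}{2}(h(\alpha, \beta) + h(\beta, \alpha)) \qquad (\alpha, \beta \in U)$$

and the anti-symmetric part is

$$g : (\alpha, \beta) \longmapsto \tfrac{1}{2}(h(\alpha, \beta) - h(\beta, \alpha)) \qquad (\alpha, \beta \in U).$$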


But wait! This is all right if F = R, but for a general field F, what do we
mean by $\tfrac{1}{2}$? What we mean, of course, is the (multiplicative)
inverse of the element 1 + 1, which exists as long as $1 + 1 \neq 0$, the
additive identity element. But for a general field F, we have no guarantee
that $1 + 1 \neq 0$; and in fact there is one field where 1 + 1 = 0,
namely the field consisting of the two elements 0 and 1, with addition and
multiplication tables:

    +  | 0  1          x  | 0  1
    ---+------         ---+------
    0  | 0  1          0  | 0  0
    1  | 1  0          1  | 0  1

In this field there is one other thing to settle: what do we mean by the
minus in, say

$$g(\alpha, \beta) = -g(\beta, \alpha)?$$

If we write it in the form $g(\alpha, \beta) + g(\beta, \alpha) = 0$, then
there is no problem, and we recognize that $-g(\beta, \alpha)$ is the
(additive) inverse of $g(\beta, \alpha)$. In our present field we have
only two elements and

$$0 + 0 = 0, \qquad 1 + 1 = 0$$

so that 0 = -0, 1 = -1. Hence our definition for anti-symmetry can be
written

$$g(\alpha, \beta) = g(\beta, \alpha),$$

i.e. it is the same as symmetry. We can therefore conclude that, for this
sort of field, a bilinear form is symmetric if and only if it is
anti-symmetric!


Because of this anomaly (and it isn't any more than that; in general we
shall ignore it), N defines skew-symmetry, which is the same as
anti-symmetry when $1 + 1 \neq 0$, but provides additional information
about the bilinear form in the case in which 1 + 1 = 0.


Definition 


A bilinear form $g : U \times U \longrightarrow F$ is skew-symmetric if
$g(\alpha, \alpha) = 0$ for all $\alpha \in U$.


If a bilinear form is skew-symmetric, then it is always anti-symmetric;
for the definition of bilinear form implies, for all $\alpha, \beta \in U$,
that

$$g(\alpha + \beta, \alpha + \beta) = g(\alpha, \alpha) + g(\alpha, \beta) + g(\beta, \alpha) + g(\beta, \beta),$$

and the definition of skew-symmetric implies

$$g(\alpha + \beta, \alpha + \beta) = g(\alpha, \alpha) = g(\beta, \beta) = 0;$$

so the two together give

$$g(\alpha, \beta) + g(\beta, \alpha) = 0$$

or equivalently, $g(\alpha, \beta) = -g(\beta, \alpha)$.


On the other hand, the converse only works if $1 + 1 \neq 0$. If g is
anti-symmetric, then taking the general equation

$$g(\alpha, \beta) = -g(\beta, \alpha)$$

and letting $\alpha = \beta$, we get

$$g(\alpha, \alpha) = -g(\alpha, \alpha);$$
$$g(\alpha, \alpha) + g(\alpha, \alpha) = 0;$$
$$(1 + 1) g(\alpha, \alpha) = 0;$$

so that, if $1 + 1 \neq 0$, we deduce $g(\alpha, \alpha) = 0$ and the
form is skew-symmetric.


Thus, as long as $1 + 1 \neq 0$, there is no difference between the
concepts of anti-symmetry and skew-symmetry. As we deal exclusively with
R or C as the fields in all applications in this course, you need only
remember one of the definitions.
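
For readers who would like to see the anomaly in action, here is a small sketch over the two-element field, with arithmetic carried out modulo 2. The bilinear form used, sending a pair of vectors to the product of their first coordinates, is a hypothetical example of our own, not one taken from N; it turns out to be anti-symmetric without being skew-symmetric.

```python
from itertools import product

def g(u, v):
    """The form g(u, v) = u1 * v1 over the two-element field (mod 2)."""
    return (u[0] * v[0]) % 2

# All four vectors of the 2-dimensional space over {0, 1}.
vectors = list(product([0, 1], repeat=2))

# Anti-symmetry: g(u, v) equals the additive inverse of g(v, u).
anti_symmetric = all(g(u, v) == (-g(v, u)) % 2
                     for u in vectors for v in vectors)

# Skew-symmetry: g(u, u) is zero for every u.
skew_symmetric = all(g(u, u) == 0 for u in vectors)

print(anti_symmetric, skew_symmetric)
```

The form fails the skew-symmetry test at the vector (1, 0), where it takes the value 1.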


READ from line 16 on page N158 to the end of Section IV-8 on page N159. 


Notes 


(i) It is important to note the "if and only if" in the statement of Theorem 8.1,
and the corresponding two parts of the proof. Also notice that in the
skew-symmetric case there is no "if and only if" theorem because of the
possibility of 1 + 1 = 0. Hence Theorems 8.2 and 8.3.


(ii) Note the definitions of symmetry and skew-symmetry of matrices that
occur in the course of the proof of Theorem 8.1, and immediately after the
proof of Theorem 8.3. Of course, a non-square matrix cannot be symmetric or
skew-symmetric; nor can a bilinear form on $U \times V$ if $V \neq U$.




Exercises 

1. Exercise 2, page N159.
2. Exercise 3, page N159. 
3. Exercise 4, page N159. 


Solutions 
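1. The symmetric part is

$$\tfrac{1}{2} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
+ \tfrac{1}{2} \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}
= \begin{bmatrix} 1 & 3 & 5 \\ 3 & 5 & 7 \\ 5 & 7 & 9 \end{bmatrix}.$$

The skew-symmetric part is

$$\tfrac{1}{2} \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
- \tfrac{1}{2} \begin{bmatrix} 1 & 4 & 7 \\ 2 & 5 & 8 \\ 3 & 6 & 9 \end{bmatrix}
= \begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{bmatrix}.$$

2. The transpose of $P^T B P$ is the product of the transposes of
$P^T$, B and P (see Unit 3, sub-section 3.2.4), reversed in order,
i.e.

$$(P^T B P)^T = P^T B^T (P^T)^T = P^T B^T P.$$

Thus, if B is symmetric, then $B^T = B$ and so $(P^T B P)^T = P^T B P$,
i.e. $P^T B P$ is symmetric. If B is skew-symmetric, then $B^T = -B$,
so that

$$(P^T B P)^T = P^T (-B) P = -(P^T B P),$$

i.e. $P^T B P$ is skew-symmetric.
3. If A is an $m \times n$ matrix, then $A^T A$ is the product of an
$n \times m$ and an $m \times n$ matrix. This product is defined, and is
$n \times n$. Similarly, the product $A A^T$ is defined, and is
$m \times m$. Furthermore, $(A^T A)^T = A^T (A^T)^T = A^T A$, so
$A^T A$ is symmetric. Similarly, $A A^T$ is symmetric.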
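
The arithmetic of Solution 1 can be confirmed with a few lines of Python. The sketch below uses exact fractions so that the halves cause no rounding; the function names are our own, and the matrix is the one with rows 1 2 3, 4 5 6 and 7 8 9.

```python
from fractions import Fraction

def symmetric_part(m):
    """Return (m + m^T) / 2 for a square matrix given as a list of rows."""
    n = len(m)
    return [[Fraction(m[i][j] + m[j][i], 2) for j in range(n)]
            for i in range(n)]

def skew_part(m):
    """Return (m - m^T) / 2 for a square matrix given as a list of rows."""
    n = len(m)
    return [[Fraction(m[i][j] - m[j][i], 2) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

S = symmetric_part(A)
K = skew_part(A)

# The two parts should add up to the original matrix.
recombined = [[S[i][j] + K[i][j] for j in range(3)] for i in range(3)]
print(S)
print(K)
```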


14.1.4 Summary of Section 14.1 


In this section we defined the terms 


bilinear form (page N156) 
symmetric bilinear form (page N158) 
anti-symmetric bilinear form (page C14) 

skew-symmetric bilinear form (page N158) 
symmetric matrix (page N158) 
skew-symmetric matrix (page N159)
congruent (page N158)

Theorems

1. (8.1, page N158)
A bilinear form f is symmetric if and only if any matrix B representing f
has the property $B^T = B$.

2. (8.2, page N158)
If a bilinear form f is skew-symmetric, then any matrix B representing f
has the property $B^T = -B$.

3. (8.3, page N158)
If $1 + 1 \neq 0$ and the matrix B representing f has the property
$B^T = -B$, then f is skew-symmetric.

4. (8.4, page N159)
If $1 + 1 \neq 0$, every bilinear form can be represented uniquely as a
sum of a symmetric bilinear form and a skew-symmetric bilinear form.


Technique
Given a bilinear form f, find $f_s$, $f_{ss}$.


Notation
$f_s(\alpha, \beta)$ (page N159)
$f_{ss}(\alpha, \beta)$ (page N159)




14.2 QUADRATIC FORMS 
14.2.1 Definition 


We are now in a position to look at certain non-linear functions from a 
vector space to its field, and analyse these in terms of bilinear forms, thus 
giving us the opportunity to use linear tools. 

Definition 


A quadratic form on a vector space V is a function
$q : V \longrightarrow F$, such that

$$q(\alpha) = f(\alpha, \alpha) \qquad (\alpha \in V)$$

for some bilinear form f on $V \times V$. (Throughout the rest of this
unit, the bilinear forms are always on $V \times V$ rather than
$U \times V$; so we can compress the notation and call them bilinear
forms on V.)


A quadratic form is not a linear form (except in the trivial case where it is 
the zero mapping); the following are among the simplest examples of 
quadratic forms. 
Examples 
1. The function q, where

$$q(x) = x^2 \qquad (x \in R)$$

is a quadratic form on R, obtained from the bilinear form f, where

$$f(x, y) = xy \qquad (x, y \in R)$$

since $f(x, x) = x^2$.
2. The function Q, where

$$Q(x_1, x_2) = x_1^2 + 2 x_1 x_2 - x_2^2 \qquad ((x_1, x_2) \in R^2)$$

is a quadratic form, since it can be obtained from the bilinear form F,
where

$$F((x_1, x_2), (y_1, y_2)) = x_1 y_1 + 2 x_1 y_2 - x_2 y_2 \qquad ((x_1, x_2), (y_1, y_2) \in R^2).$$

To discover the quadratic form q corresponding to a given bilinear form
f, we simply use the equation $q(\alpha) = f(\alpha, \alpha)$. For
example, if

$$f((x_1, x_2), (y_1, y_2)) = 2 x_1 y_1 + 3 x_2 y_2 \qquad ((x_1, x_2), (y_1, y_2) \in R^2),$$

then the corresponding quadratic form is specified by

$$\begin{aligned}
q(x_1, x_2) &= f((x_1, x_2), (x_1, x_2)) \\
 &= 2 x_1 x_1 + 3 x_2 x_2 \\
 &= 2 x_1^2 + 3 x_2^2 \qquad ((x_1, x_2) \in R^2).
\end{aligned}$$

Similarly, if f is the bilinear form on C[0, 1] specified by

$$f(g, h) = \int_0^1 g(x) h(x) \, dx \qquad (g, h \in C[0, 1]),$$

then the corresponding quadratic form is specified by setting h = g:

$$q(g) = f(g, g) = \int_0^1 (g(x))^2 \, dx \qquad (g \in C[0, 1]).$$
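
The passage from a bilinear form to its quadratic form is simple enough to sketch in code. Below, a bilinear form on the plane is stored through its matrix, and the quadratic form is obtained by feeding the same vector into both places; the function names are our own, and the matrix is that of the form with values $2 x_1 y_1 + 3 x_2 y_2$ used above. The last lines illustrate that the result is not linear: doubling the vector multiplies the value by four.

```python
def bilinear(A, x, y):
    """Evaluate the bilinear form with matrix A at the pair (x, y)."""
    return sum(x[i] * A[i][j] * y[j]
               for i in range(len(x))
               for j in range(len(y)))

def quadratic(A, x):
    """Evaluate the corresponding quadratic form q(x) = f(x, x)."""
    return bilinear(A, x, x)

A = [[2, 0],
     [0, 3]]          # matrix of 2*x1*y1 + 3*x2*y2

value = quadratic(A, [1, 2])            # 2*1 + 3*4
doubled = quadratic(A, [2, 4])          # the same vector, doubled
print(value, doubled)
```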




Exercises 


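1. Write down the quadratic forms corresponding to the following
bilinear forms*:

(i) $x_1 y_1 + x_2 y_2 \qquad ((x_1, x_2), (y_1, y_2) \in R^2)$

(ii) $x_1 y_2 + x_2 y_1 \qquad ((x_1, x_2), (y_1, y_2) \in R^2)$

(iii) $2 x_1 y_2 \qquad ((x_1, x_2), (y_1, y_2) \in R^2)$

(iv) $\begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} \qquad ((x_1, x_2), (y_1, y_2) \in R^2)$

(v) $\int_0^1 f(x) g(1 - x) \, dx \qquad (f, g \in C[0, 1])$

(vi) $\int_0^1 x f(x) g(x) \, dx \qquad (f, g \in C[0, 1])$


2. Let q be any quadratic form on a vector space V, whose field is R.
Show that, for any $a \in R$ and any vector $\alpha \in V$,

$$q(a\alpha) = a^2 q(\alpha).$$

Solutions
1. In abbreviated form the quadratic forms are:

(i) $x_1^2 + x_2^2 \qquad ((x_1, x_2) \in R^2)$

(ii) $2 x_1 x_2 \qquad ((x_1, x_2) \in R^2)$

(iii) $2 x_1 x_2 \qquad ((x_1, x_2) \in R^2)$

(iv) $\begin{vmatrix} x_1 & x_2 \\ x_1 & x_2 \end{vmatrix} = 0 \qquad ((x_1, x_2) \in R^2)$

(v) $\int_0^1 f(x) f(1 - x) \, dx \qquad (f \in C[0, 1])$

(vi) $\int_0^1 x [f(x)]^2 \, dx \qquad (f \in C[0, 1])$

2. Since q is a quadratic form, we have

$$\begin{aligned}
q(a\alpha) &= f(a\alpha, a\alpha) && \text{where } f \text{ is bilinear} \\
 &= a^2 f(\alpha, \alpha) && \text{since } f \text{ is a bilinear form} \\
 &= a^2 q(\alpha).
\end{aligned}$$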


14.2.2 The Polar Form of a Quadratic Form 


In parts (ii) and (iii) of Solution 1 of the previous sub-section, we saw that
two different bilinear forms, $x_1 y_2 + x_2 y_1$ and $2 x_1 y_2$, give
rise to the same quadratic form $2 x_1 x_2$. However, only the first of
the bilinear forms is symmetric. The next reading passage shows that this
is a special case of a general result:


any given quadratic form q can be obtained from many different
bilinear forms f, using the formula

$$q(\alpha) = f(\alpha, \alpha)$$

but only one of the bilinear forms is symmetric.


READ Section IV-9 on page N160 to the end on page N162. 


* We have abbreviated the notation, writing, for example

$$xy \qquad (x, y \in R)$$

in place of

$$(x, y) \longmapsto xy \qquad (x, y \in R).$$

This practice is adopted by N.


Notes 


(i) Paragraph preceding Theorem 9.1, page N161
Note the suggestive mnemonic:

$$xy = \tfrac{1}{2}((x + y)^2 - x^2 - y^2).$$

This gives a helpful way of remembering the formula

$$f(\alpha, \beta) = \tfrac{1}{2}\{q(\alpha + \beta) - q(\alpha) - q(\beta)\}.$$


(ii) line -10, page N161 By the matrix representing a quadratic form q we
mean a matrix A such that

$$q(\alpha) = X^T A X \qquad (\alpha \in V)$$

where X is the one-column matrix representing $\alpha$. This matrix also
represents the bilinear form f defined by

$$f(\alpha, \beta) = X^T A Y \qquad (\alpha, \beta \in V)$$

where Y is the one-column matrix representing $\beta$. This bilinear form
is one of those with the property

$$q(\alpha) = f(\alpha, \alpha) \qquad (\alpha \in V)$$

and if A is symmetric f is the polar form of q.

(iii) line —6, page N161 to the end of the section The details of the geometrical 
interpretation are not important. The geometry involved is not part of the 
course. 


Example 
If q is the quadratic form specified by

$$q(x_1, x_2) = x_1^2 + 4 x_1 x_2 + 2 x_2^2 \qquad ((x_1, x_2) \in R^2),$$

then in matrix notation,

$$q(x_1, x_2) = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 1 & 4 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = X^T A X$$

so A is a matrix representing q. A also represents the bilinear form
specified by

$$f((x_1, x_2), (y_1, y_2)) = x_1 y_1 + 4 x_1 y_2 + 2 x_2 y_2,$$

which is a bilinear form (not symmetric, however) such that
$q(\alpha) = f(\alpha, \alpha)$. On the other hand, the symmetric bilinear
form corresponding to q, i.e. the polar form, is by Equation (9.1) on
page N161

$$f_s((x_1, x_2), (y_1, y_2)) = x_1 y_1 + 2 x_1 y_2 + 2 x_2 y_1 + 2 x_2 y_2$$

which has the matrix

$$A_s = \begin{bmatrix} 1 & 2 \\ 2 & 2 \end{bmatrix}$$

and it is still true that

$$q(x_1, x_2) = X^T A_s X.$$


Exercise 


Write down the polar form of each of the following quadratic forms.

(i) $x_1^2 \qquad (x_1 \in R)$

(ii) $x_1^2 + x_2^2 \qquad ((x_1, x_2) \in R^2)$

(iii) $x_1^2 + 2 x_1 x_2 + 3 x_2 x_3 + x_3^2 \qquad ((x_1, x_2, x_3) \in R^3)$

(iv) $\int_0^1 f(x) f(1 - x) \, dx \qquad (f \in C[0, 1])$.




Solution 
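(i) $$\begin{aligned}
f(x_1, y_1) &= \tfrac{1}{2}[q(x_1 + y_1) - q(x_1) - q(y_1)] \\
 &= \tfrac{1}{2}[(x_1 + y_1)^2 - x_1^2 - y_1^2] \\
 &= x_1 y_1.
\end{aligned}$$

(ii) $$\begin{aligned}
f((x_1, x_2), (y_1, y_2)) &= \tfrac{1}{2}[q(x_1 + y_1, x_2 + y_2) - q(x_1, x_2) - q(y_1, y_2)] \\
 &= \tfrac{1}{2}[(x_1 + y_1)^2 + (x_2 + y_2)^2 - x_1^2 - x_2^2 - y_1^2 - y_2^2] \\
 &= x_1 y_1 + x_2 y_2.
\end{aligned}$$

(iii) $$\begin{aligned}
f((x_1, x_2, x_3), (y_1, y_2, y_3))
 &= \tfrac{1}{2}[q(x_1 + y_1, x_2 + y_2, x_3 + y_3) - q(x_1, x_2, x_3) - q(y_1, y_2, y_3)] \\
 &= \tfrac{1}{2}[(x_1 + y_1)^2 + 2(x_1 + y_1)(x_2 + y_2) + 3(x_2 + y_2)(x_3 + y_3) + (x_3 + y_3)^2 \\
 &\qquad - x_1^2 - 2 x_1 x_2 - 3 x_2 x_3 - x_3^2 - y_1^2 - 2 y_1 y_2 - 3 y_2 y_3 - y_3^2] \\
 &= x_1 y_1 + x_1 y_2 + x_2 y_1 + \tfrac{3}{2} x_2 y_3 + \tfrac{3}{2} x_3 y_2 + x_3 y_3.
\end{aligned}$$

(iv) If we let Q be the quadratic form given, and F the polar
form of Q, then, for $f, g \in C[0, 1]$:

$$\begin{aligned}
F(f, g) &= \tfrac{1}{2}[Q(f + g) - Q(f) - Q(g)] \\
 &= \tfrac{1}{2}\Big\{ \int_0^1 (f(x) + g(x))(f(1 - x) + g(1 - x)) \, dx
   - \int_0^1 f(x) f(1 - x) \, dx - \int_0^1 g(x) g(1 - x) \, dx \Big\} \\
 &= \tfrac{1}{2} \int_0^1 [f(x) g(1 - x) + g(x) f(1 - x)] \, dx.
\end{aligned}$$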


In general, the polar form is not the simplest bilinear form corresponding
to a given quadratic form. For example, in part (iv) of the last exercise, a
simpler bilinear form is

$$H(f, g) = \int_0^1 f(x) g(1 - x) \, dx$$

and we still have

$$Q(f) = H(f, f).$$


You may have noticed that in parts (i), (ii) and (iii) of the exercise there is
a rule of thumb which gives the answer considerably more speedily than
by applying the formula

$$f(\alpha, \beta) = \tfrac{1}{2}[q(\alpha + \beta) - q(\alpha) - q(\beta)].$$

This rule is: every term of the type $x_i^2$ in the quadratic form becomes
$x_i y_i$ in the polar form, whereas each term of the type $x_i x_j$
($i \neq j$) in the quadratic form becomes
$\tfrac{1}{2}(x_i y_j + x_j y_i)$ in the polar form. This rule always
works for quadratic forms which are expressed in terms of coordinates
with respect to a basis. However, to do (iv) above without guesswork, one
needs a formula which is not expressed in terms of a basis.
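
The agreement between the formula and the rule of thumb can be spot-checked numerically. The sketch below takes the quadratic form of part (iii) of the exercise, evaluates its polar form once from the formula and once from the rule of thumb, and compares the two at a pair of arbitrarily chosen integer vectors; the function names and the test vectors are our own.

```python
from fractions import Fraction

def q(x):
    """The quadratic form x1^2 + 2*x1*x2 + 3*x2*x3 + x3^2 of part (iii)."""
    x1, x2, x3 = x
    return x1**2 + 2 * x1 * x2 + 3 * x2 * x3 + x3**2

def polar_by_formula(a, b):
    """Polar form from f(a, b) = (q(a + b) - q(a) - q(b)) / 2."""
    s = [ai + bi for ai, bi in zip(a, b)]
    return Fraction(q(s) - q(a) - q(b), 2)

def polar_by_rule(x, y):
    """Polar form from the rule of thumb applied term by term."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    return (x1 * y1                                   # from x1^2
            + 2 * Fraction(1, 2) * (x1 * y2 + x2 * y1)  # from 2*x1*x2
            + 3 * Fraction(1, 2) * (x2 * y3 + x3 * y2)  # from 3*x2*x3
            + x3 * y3)                                # from x3^2

a = (1, 2, 3)
b = (4, -1, 2)
by_formula = polar_by_formula(a, b)
by_rule = polar_by_rule(a, b)
print(by_formula, by_rule)
```

Putting both arguments equal recovers the quadratic form itself, as it should.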




Exercises 


1. Write down the polar forms of each of the following quadratic forms,
using the above rule.

(i) $x_1 x_2 + x_2 x_3 \qquad ((x_1, x_2, x_3) \in R^3)$

(ii) $x_1^2 + x_1 x_2 + x_3^2 \qquad ((x_1, x_2, x_3) \in R^3)$

(iii) $-x_1^2 + x_2 x_3 \qquad ((x_1, x_2, x_3) \in R^3)$

(iv) $x_1^2 + x_1 x_2 - x_2^2 + 4 x_2 x_3 + 2 x_3^2 \qquad ((x_1, x_2, x_3) \in R^3)$


2. Write down the symmetric matrices representing each of the quadratic 
forms of Exercise 1. 


Solutions 
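1. ($(x_1, x_2, x_3)$ and $(y_1, y_2, y_3) \in R^3$ throughout.)

(i) $\tfrac{1}{2}(x_1 y_2 + x_2 y_1 + x_2 y_3 + x_3 y_2)$

(ii) $x_1 y_1 + \tfrac{1}{2}(x_1 y_2 + x_2 y_1) + x_3 y_3$

(iii) $-x_1 y_1 + \tfrac{1}{2}(x_2 y_3 + x_3 y_2)$

(iv) $x_1 y_1 + \tfrac{1}{2}(x_1 y_2 + x_2 y_1) - x_2 y_2 + 2(x_2 y_3 + x_3 y_2) + 2 x_3 y_3$

2.
(i) $\begin{bmatrix} 0 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & 0 \end{bmatrix}$
(ii) $\begin{bmatrix} 1 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

(iii) $\begin{bmatrix} -1 & 0 & 0 \\ 0 & 0 & \tfrac{1}{2} \\ 0 & \tfrac{1}{2} & 0 \end{bmatrix}$
(iv) $\begin{bmatrix} 1 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & -1 & 2 \\ 0 & 2 & 2 \end{bmatrix}$
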
14.2.3 The Quadratic Taylor Approximation 
In this sub-section, we will apply quadratic forms to the problem of
approximating complicated functions by simpler ones. In sub-section
14.3.4, we use such approximations to classify the stationary points of
functions of two variables.


In Unit M100 14, Sequences and Limits II, we saw that a function f of
one real variable can be approximated by polynomials, provided it and
its derivatives satisfy suitable conditions. The higher the degree of the
polynomial, the better the approximation. These approximations are
known as Taylor approximations, and are obtained by calculating the
derivatives of the function at a suitable point. If we choose the point
$x_0$, then the nth-degree Taylor approximation to f(x) in the
neighbourhood of $x_0$ is given by:

$$f(x_0) + (x - x_0) f'(x_0) + \frac{1}{2!}(x - x_0)^2 f''(x_0) + \cdots + \frac{1}{n!}(x - x_0)^n f^{(n)}(x_0).$$


Now there is a similar approximation process for functions of more than
one real variable (i.e. for functions with domain $R^2$, $R^3$, ...). The formulas
in this case are expressed in terms of partial derivatives, which you met
in Unit M100 15, Differentiation II, for functions of two real variables.
We give a brief résumé of the concept of a partial derivative below.


If f is a function with domain $R^2$ and codomain R, let us for convenience
call the first variable $x_1$ and the second variable $x_2$. For instance,

$$f: (x_1, x_2) \longmapsto e^{x_1} + x_1 \sin x_2$$

is a function of the sort we are interested in.




Then the first partial derivatives are obtained by fixing one of the variables
and considering the derivative of f considered as a function of the other
variable only. In symbols*

$$f_1(x_1, x_2) = \lim_{h \to 0} \frac{f(x_1 + h, x_2) - f(x_1, x_2)}{h},$$

$$f_2(x_1, x_2) = \lim_{h \to 0} \frac{f(x_1, x_2 + h) - f(x_1, x_2)}{h}.$$


Second partial derivatives are similarly defined: $f_{11} = (f_1)_1$, $f_{12} = (f_1)_2$, etc.
For instance, in the case of f above:

$$f_1(x_1, x_2) = e^{x_1} + \sin x_2$$
(since all we have to do is to differentiate with respect to $x_1$, regarding $x_2$
as a constant).
Similarly:
$$f_2(x_1, x_2) = x_1 \cos x_2.$$
The partial derivative of $f_1$ with respect to $x_1$ is
$$f_{11}(x_1, x_2) = e^{x_1}.$$
Also,
$$f_{12}(x_1, x_2) = \cos x_2,$$
$$f_{21}(x_1, x_2) = \cos x_2,$$
$$f_{22}(x_1, x_2) = -x_1 \sin x_2.$$
Notice that in this case, $f_{12} = f_{21}$. This will always be true in the cases
which interest us. (In fact, the condition that all the second partial derived
functions exist and are continuous is enough to guarantee that $f_{12} = f_{21}$.)
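
The equality $f_{12} = f_{21}$ for the example function can be illustrated numerically. The following is a minimal sketch, not part of the course text: it replaces the limits by central differences with a small fixed step h (an assumption of this example), so the results agree only approximately. The helper names `d1` and `d2` are hypothetical.

```python
import math


def f(x1, x2):
    # The example function from the text: f(x1, x2) = e^{x1} + x1 sin x2.
    return math.exp(x1) + x1 * math.sin(x2)


def d1(g, h=1e-4):
    """Central-difference approximation to the partial derivative in x1."""
    return lambda x1, x2: (g(x1 + h, x2) - g(x1 - h, x2)) / (2 * h)


def d2(g, h=1e-4):
    """Central-difference approximation to the partial derivative in x2."""
    return lambda x1, x2: (g(x1, x2 + h) - g(x1, x2 - h)) / (2 * h)


f12 = d2(d1(f))   # differentiate in x1 first, then in x2
f21 = d1(d2(f))   # differentiate in x2 first, then in x1

point = (0.3, 0.7)
# All three printed values should be close to cos(0.7).
print(f12(*point), f21(*point), math.cos(point[1]))
```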


Exercises 


1. Find all the first and second partial derivatives of the following functions,
where the domain is $R^2$ in each case.
(i) $f: (x_1, x_2) \longmapsto 3x_1x_2 + 5x_2^2 + 3x_1^3x_2$
(ii) $g: (x_1, x_2) \longmapsto x_1 \exp(x_1 + x_2)$
(iii) $h: (x_1, x_2) \longmapsto x_1 \cos(2x_1 + x_2)$


2. Find 4,(0, 0) and ^a. 0) where A is as in part (iii) of Exercise 1. 


Solutions 


1. (i) $f_1(x_1, x_2) = 3x_2 + 9x_1^2x_2$
$f_2(x_1, x_2) = 3x_1 + 10x_2 + 3x_1^3$
$f_{11}(x_1, x_2) = 18x_1x_2$
$f_{12}(x_1, x_2) = f_{21}(x_1, x_2) = 3 + 9x_1^2$
$f_{22}(x_1, x_2) = 10$


(ii) $g_1(x_1, x_2) = (x_1 + 1) \exp(x_1 + x_2)$
$g_2(x_1, x_2) = x_1 \exp(x_1 + x_2)$
$g_{11}(x_1, x_2) = (x_1 + 2) \exp(x_1 + x_2)$
$g_{12}(x_1, x_2) = g_{21}(x_1, x_2) = (x_1 + 1) \exp(x_1 + x_2)$
$g_{22}(x_1, x_2) = x_1 \exp(x_1 + x_2)$


* Note that we have dropped the prime from $f_1'$, $f_2'$, the forms used in the Foundation
Course.


(iii) $h_1(x_1, x_2) = \cos(2x_1 + x_2) - 2x_1 \sin(2x_1 + x_2)$
$h_2(x_1, x_2) = -x_1 \sin(2x_1 + x_2)$
$h_{11}(x_1, x_2) = -4 \sin(2x_1 + x_2) - 4x_1 \cos(2x_1 + x_2)$
$h_{12}(x_1, x_2) = h_{21}(x_1, x_2) = -\sin(2x_1 + x_2) - 2x_1 \cos(2x_1 + x_2)$
$h_{22}(x_1, x_2) = -x_1 \cos(2x_1 + x_2)$


2. $h_1(0, 0) = \cos(0) - 2 \times 0 \times \sin(0) = 1$


We are now ready to look at the Taylor approximation method in two 
variables. The most general polynomial function of first degree in two 
variables has the form 


$$p: (x_1, x_2) \longmapsto a + bx_1 + cx_2 \qquad ((x_1, x_2) \in R^2)$$


and the first-degree Taylor approximation to f will be a first-degree poly- 
nomial function of this sort. The approximation about (0, 0), say, will be 
obtained by requiring the image of p and its first partial derivatives at 
(0, 0) to agree with the image of f and its first partial derivatives at (0, 0). 
These three conditions give just enough information to determine the 
three numbers a, b, c: 


$$f(0, 0) = a,$$
$$f_1(0, 0) = b,$$
$$f_2(0, 0) = c.$$


Exercise 


Show that the above choice of a, b and c makes the image of the function 
and its first partial derivatives at (0, 0) agree with the image of the ap- 
proximation and its first partial derivatives at (0, 0). 


Solution 
The approximation is the polynomial function p specified by

$$p(x_1, x_2) = f(0, 0) + x_1 f_1(0, 0) + x_2 f_2(0, 0) \qquad ((x_1, x_2) \in R^2).$$
Thus
$$p_1(x_1, x_2) = f_1(0, 0) \qquad ((x_1, x_2) \in R^2)$$
$$p_2(x_1, x_2) = f_2(0, 0) \qquad ((x_1, x_2) \in R^2).$$

Then, setting $(x_1, x_2) = (0, 0)$ in the above three equations, we
get

$$p(0, 0) = f(0, 0),$$
$$p_1(0, 0) = f_1(0, 0),$$
$$p_2(0, 0) = f_2(0, 0).$$

Thus the first-degree Taylor approximation is

$$f(x_1, x_2) \approx f(0, 0) + x_1 f_1(0, 0) + x_2 f_2(0, 0).$$

Geometrically, this corresponds to approximating the three-dimensional
surface $x_3 = f(x_1, x_2)$ by its tangent plane at the
point $(0, 0, f(0, 0))$, in the manner explained in Unit M100 15.


[Figure: the surface $x_3 = f(x_1, x_2)$ and the tangent plane at the point $(0, 0, f(0, 0))$.]


The same method can be extended to give higher-degree Taylor approximations.
For instance the second-degree Taylor approximation to f about
(0, 0) can be obtained by writing out the most general second-degree
polynomial in $x_1$, $x_2$:

$$r(x_1, x_2) = A + Bx_1 + Cx_2 + Dx_1^2 + Ex_1x_2 + Fx_2^2.$$

We now choose the numbers A, B, ..., F to make the images of r and f
agree in their values at (0, 0), and in the values of the first and second
partial derivatives, all at (0, 0).


Exercise 


Verify that the appropriate values to take are: 


$$A = f(0, 0), \quad B = f_1(0, 0), \quad C = f_2(0, 0), \quad D = \tfrac{1}{2} f_{11}(0, 0),$$
$$E = f_{12}(0, 0), \quad F = \tfrac{1}{2} f_{22}(0, 0).$$


Solution 
With these values, the approximation is given by:
$$r(x_1, x_2) = f(0, 0) + x_1 f_1(0, 0) + x_2 f_2(0, 0) + \tfrac{1}{2} x_1^2 f_{11}(0, 0) + x_1x_2 f_{12}(0, 0) + \tfrac{1}{2} x_2^2 f_{22}(0, 0).$$

Differentiating:

$$r_1(x_1, x_2) = f_1(0, 0) + x_1 f_{11}(0, 0) + x_2 f_{12}(0, 0)$$
$$r_2(x_1, x_2) = f_2(0, 0) + x_1 f_{12}(0, 0) + x_2 f_{22}(0, 0)$$
$$r_{11}(x_1, x_2) = f_{11}(0, 0)$$
$$r_{12}(x_1, x_2) = f_{12}(0, 0)$$
$$r_{22}(x_1, x_2) = f_{22}(0, 0)$$


and so the image of r and its first and second partial derivatives 
at (0, 0), agree with the image of f and its first and second partial 
derivatives at (0, 0). 




Thus, the second-degree Taylor approximation to $f(x_1, x_2)$ about (0, 0) is
$$f(x_1, x_2) \approx f(0, 0)$$
$$\qquad + x_1 f_1(0, 0) + x_2 f_2(0, 0)$$
$$\qquad + \tfrac{1}{2} x_1^2 f_{11}(0, 0) + x_1x_2 f_{12}(0, 0) + \tfrac{1}{2} x_2^2 f_{22}(0, 0).$$

We have written this expression on three lines because it breaks up into
three distinct parts. On the first line, we have a constant, on the second
line, a linear functional on $R^2$, and on the third line, a quadratic form on
$R^2$. Thus we can write

$$f(x_1, x_2) \approx f(0, 0) + P(x_1, x_2) + Q(x_1, x_2) \qquad ((x_1, x_2) \in R^2)$$
where P is the linear functional specified by

$$P(x_1, x_2) = x_1 f_1(0, 0) + x_2 f_2(0, 0)$$
and Q is the quadratic form specified by

$$Q(x_1, x_2) = \tfrac{1}{2} x_1^2 f_{11}(0, 0) + x_1x_2 f_{12}(0, 0) + \tfrac{1}{2} x_2^2 f_{22}(0, 0).$$


The linear functional P involves the first partial derivatives of f at (0, 0), 
while the quadratic form Q involves the second partial derivatives at 
(0, 0). 


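The polar form of Q takes, at $((x_1, x_2), (y_1, y_2))$ with $(x_1, x_2), (y_1, y_2) \in R^2$, the value

$$\tfrac{1}{2} x_1y_1 f_{11}(0, 0) + \tfrac{1}{2}(x_1y_2 + x_2y_1) f_{12}(0, 0) + \tfrac{1}{2} x_2y_2 f_{22}(0, 0)$$
$$= \tfrac{1}{2} x_1y_1 f_{11}(0, 0) + \tfrac{1}{2} x_1y_2 f_{12}(0, 0) + \tfrac{1}{2} x_2y_1 f_{21}(0, 0) + \tfrac{1}{2} x_2y_2 f_{22}(0, 0)$$
since $f_{21} = f_{12}$.


We can write this value as

$$\tfrac{1}{2} \sum_{i=1}^{2} \sum_{j=1}^{2} x_i y_j f_{ij}(0, 0).$$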


Exercise 

Write down: 

(i) the first-degree Taylor approximation about (0, 0) to
$\sin\left(\tfrac{\pi}{4} + x_1 + x_2\right)$ $((x_1, x_2) \in R^2)$;

(ii) the corresponding second-degree approximation.


Solution 
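The first and second partial derivatives of
$f(x_1, x_2) = \sin\left(\tfrac{\pi}{4} + x_1 + x_2\right)$ are:

$$f_1(x_1, x_2) = \cos\left(\tfrac{\pi}{4} + x_1 + x_2\right) \qquad f_2(x_1, x_2) = \cos\left(\tfrac{\pi}{4} + x_1 + x_2\right)$$
$$f_{11}(x_1, x_2) = -\sin\left(\tfrac{\pi}{4} + x_1 + x_2\right) \qquad f_{12}(x_1, x_2) = -\sin\left(\tfrac{\pi}{4} + x_1 + x_2\right)$$
$$f_{22}(x_1, x_2) = -\sin\left(\tfrac{\pi}{4} + x_1 + x_2\right)$$

Thus:

(i) $$\sin\left(\tfrac{\pi}{4} + x_1 + x_2\right) \approx f(0, 0) + x_1 f_1(0, 0) + x_2 f_2(0, 0)$$
$$= \sin\tfrac{\pi}{4} + x_1 \cos\tfrac{\pi}{4} + x_2 \cos\tfrac{\pi}{4}$$
$$= \frac{1}{\sqrt{2}}(1 + x_1 + x_2)$$

(ii) $$\sin\left(\tfrac{\pi}{4} + x_1 + x_2\right) \approx f(0, 0) + x_1 f_1(0, 0) + x_2 f_2(0, 0) + \tfrac{1}{2} x_1^2 f_{11}(0, 0) + x_1x_2 f_{12}(0, 0) + \tfrac{1}{2} x_2^2 f_{22}(0, 0)$$
$$= \sin\tfrac{\pi}{4} + x_1 \cos\tfrac{\pi}{4} + x_2 \cos\tfrac{\pi}{4} - \tfrac{1}{2} x_1^2 \sin\tfrac{\pi}{4} - x_1x_2 \sin\tfrac{\pi}{4} - \tfrac{1}{2} x_2^2 \sin\tfrac{\pi}{4}$$
$$= \frac{1}{\sqrt{2}}\left(1 + x_1 + x_2 - \tfrac{1}{2} x_1^2 - x_1x_2 - \tfrac{1}{2} x_2^2\right)$$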
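
To see how good the second-degree approximation is near (0, 0), one can compare it with the exact value at a few points. The sketch below is illustrative only; the function name `quadratic_taylor` and the sample points are assumptions of this example, not part of the text.

```python
import math


def quadratic_taylor(f0, f1, f2, f11, f12, f22):
    """Return the second-degree Taylor approximation about (0, 0), built
    from the value and the partial derivatives of f at (0, 0)."""
    def r(x1, x2):
        linear = x1 * f1 + x2 * f2
        quadratic = 0.5 * x1 * x1 * f11 + x1 * x2 * f12 + 0.5 * x2 * x2 * f22
        return f0 + linear + quadratic
    return r


# Values at (0, 0) for f(x1, x2) = sin(pi/4 + x1 + x2), as in the solution.
s = math.sin(math.pi / 4)
c = math.cos(math.pi / 4)
r = quadratic_taylor(s, c, c, -s, -s, -s)

for x1, x2 in [(0.1, 0.05), (0.01, -0.02)]:
    exact = math.sin(math.pi / 4 + x1 + x2)
    # The last column is the error, which shrinks quickly as the point nears (0, 0).
    print(x1, x2, exact, r(x1, x2), abs(exact - r(x1, x2)))
```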


14.2.4 Summary of Section 14.2 


In this section we defined the terms 


quadratic form (page N160) 
polar form (page N161) 


Theorem 


(9.1, page N161) 

Every symmetric bilinear form $f_s$ determines a unique quadratic form by
the rule $q(\alpha) = f_s(\alpha, \alpha)$, and if $1 + 1 \neq 0$, every quadratic form determines
a unique symmetric bilinear form $f_s(\alpha, \beta) = \tfrac{1}{2}[q(\alpha + \beta) - q(\alpha) - q(\beta)]$
from which it is in turn determined by the given rule. There is a one-to-one
correspondence between symmetric bilinear forms and quadratic forms.


Technique 


Given a particular function of two variables, $f(x_1, x_2)$, find the quadratic
Taylor approximation using the formula:

$$f(x_1, x_2) \approx f(0, 0) + \sum_{i=1}^{2} x_i f_i(0, 0) + \tfrac{1}{2} \sum_{i=1}^{2} \sum_{j=1}^{2} x_i x_j f_{ij}(0, 0)$$


Notation 
(x) (page N161) 




14.3 THE NORMAL FORM
14.3.0 Introduction 


We saw in sub-section 14.1.2 that if two square matrices A and A' represent
the same bilinear form with respect to different bases, then $A' = P^TAP$
where P is a matrix of transition. The matrices A and A' related in this
way are said to be congruent and we verified in Exercise 2 of sub-section 
14.1.2 that congruence is an equivalence relation. This means that once 
again it is useful to define a normal form for a matrix, namely the 
"simplest" matrix in the equivalence class.* As in the case of Hermite 
normal form, we shall see that the way to compute the normal form is to 
work in stages, making the matrix progressively simpler. 


This particular type of normal form is, of course, only useful for a matrix 
that represents a bilinear form: the normal form appropriate to a particu- 
lar discussion depends essentially on the effect of changes of bases. The 
Hermite normal form arose in the context of the representation of a 
linear transformation by a matrix, where we considered change of bases 
in the codomain only. The Jordan normal form arose in the same context, 
but we were considering linear transformations of a space to itself. Here 
the context is different: we have found that we can represent a bilinear 
form (with respect to some basis) by a matrix and the change of basis has 
an effect on the matrix which is different from the previous two. 


To illustrate the usefulness of normal forms under the congruence relation,
we will demonstrate what happens in the case of matrices representing
symmetric bilinear forms on $R^2$.


Apart from the zero normal form, the normal forms that arise are

(i) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$;  (ii) $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$;  (iii) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$;  (iv) $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$.

If for each of the corresponding quadratic forms q we consider the graph of the equation
$q(x_1, x_2) = 1$, then the first of the above normal forms corresponds to a
circle or ellipse, the second to a pair of straight lines, the third to a hyperbola,
and there is no graph corresponding to the fourth normal form;
in this case the solution set of $q(x_1, x_2) = 1$ is empty.


Exercise 


Suppose that the above matrices represent bilinear forms with respect to
the standard basis in $R^2$, and that a variable point in $R^2$ is denoted by
$(x_1, x_2)$. Write out the equation $q(x_1, x_2) = 1$ in terms of $x_1$, $x_2$ in each
case, and draw its graph.


* You may find it of interest to read Section II-9, pages N74-78, which discusses the concept
of normal form in general; we now have a few examples. We also discussed the general
concept in Unit 10, Jordan Normal Form.




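Solution
(i) $x_1^2 + x_2^2 = 1$.   (ii) $x_1^2 = 1$, i.e. $x_1 = \pm 1$.

[Figures: (i) a circle; (ii) a pair of straight lines; both drawn against $x_1$ and $x_2$ axes.]

(iii) $x_1^2 - x_2^2 = 1$.   (iv) $-x_1^2 - x_2^2 = 1$.

[Figures: (iii) a hyperbola, drawn against $x_1$ and $x_2$ axes; (iv) the empty set.]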


14.3.1 Getting the Matrix Diagonal 


The first step which we take in reducing a symmetric matrix to its normal 
form is to diagonalize it. (We consider symmetric matrices only, since 
among the many bilinear forms corresponding to a quadratic form, a 
symmetric bilinear form can always be chosen.) 


Section IV-10 of N contains a theorem showing that any bilinear form
can be diagonalized and an algorithm for doing it. We do not expect you
to use this method, but the theorem is the subject of the next reading
passage.


READ Section IV-10 starting on page N162 as far as line -5 (the end of
the proof) on page N163.




Notes 


(i) line 10, page N163 In less technical language, the proof shows that if
n > 1, we can reduce the dimension of the problem by 1. Since we can repeat
this technique until we get down to a 1-dimensional problem, and since a 1 x 1
matrix is automatically diagonal, we know that we can in principle tackle an
n-dimensional problem for any finite n.

(ii) Equation (10.1), page N163 It is here that we use the fact that the matrix
is symmetric.

(iii) line 18: the sentence after Equation (10.1), page N163 This is really a
"proof by contradiction" in miniature. If we assume $q(\alpha) = 0$ for all $\alpha$, then the
matrix representing it with respect to any basis is 0. But we assumed a few lines
back that $B \neq 0$. So there must exist an $\alpha$ such that $q(\alpha) \neq 0$.

(iv) line 21, page N163 The linear functional is $\alpha \longmapsto f_s(\alpha_1, \alpha)$.

(v) line 23, page N163 Since the linear functional is not the zero function,
its rank is 1. The dimension of its domain is n, and therefore the dimension of
its kernel $W_1$ is n - 1 (by Theorem 1.6 on page N31).

(vi) line 24, page N163 "$f_s$ restricted to $W_1$" means the bilinear form

$$(\alpha, \beta) \longmapsto f_s(\alpha, \beta) \qquad (\alpha \in W_1, \beta \in W_1)$$

with domain $W_1 \times W_1$ instead of $V \times V$.

(vii) line 25, page N163 "by assumption": that is, by our inductive hypothesis
that the theorem has already been established for bilinear forms on spaces
(here $W_1$) of dimension n - 1.

(viii) line 26, page N163 The notation "$2 \le i, j \le n$" means "$2 \le i \le n$ and
$2 \le j \le n$".

(ix) line 27, page N163 In other words, the part of the matrix below shown unshaded
contains only zeros (except for $d_1$) because we chose $\alpha_2, \ldots, \alpha_n$ in the
set $W_1 = \{\alpha : f_s(\alpha_1, \alpha) = 0\}$. The small square contains only the one entry $d_1$,
and the large square is the matrix of the bilinear form "$f_s$ restricted to $W_1$",
and is therefore diagonal by our inductive hypothesis.


We next consider two methods of obtaining a diagonal form of a sym- 
metric matrix, B: 


(a) the method of elementary row and column operations, 
(b) the method of completing the square. 


Elementary Row and Column Operations 
If P is the matrix of transition, the diagonal form B' is given by
$$B' = P^TBP.$$

Since P is non-singular, it is the product of elementary matrices. We can
therefore get from B to BP by applying the corresponding column operations.
Similarly, $P^T$ is the product of elementary matrices, and we can get
from BP to $P^TBP$ by applying the corresponding row operations.




Now it turns out that the required row operations correspond exactly to
the required column operations, and must be performed in the same order.
To see this, suppose that P is written out as a product of elementary
matrices:

$$P = E_1E_2 \cdots E_r.$$

Then to find $P^T$, we must multiply the transposes of the elementary
matrices together in the opposite order:

$$P^T = E_r^T \cdots E_2^TE_1^T.$$
Thus:
$$B' = E_r^T \cdots E_2^TE_1^T B E_1E_2 \cdots E_r.$$

Now it is not difficult to see that if multiplying by $E_1$ on the right corresponds
to multiplying column i by c, then $E_1^T = E_1$, and multiplying by $E_1^T$
on the left corresponds to multiplying row i by c. Similarly, if multiplying
by $E_2$ on the right corresponds to interchanging columns i and j, then
$E_2^T = E_2$, and multiplying by $E_2^T$ on the left corresponds to interchanging
rows i and j. Finally, if multiplying by $E_3$ on the right corresponds to
adding k times column i to column j, then (although $E_3^T \neq E_3$) multiplying
by $E_3^T$ on the left corresponds to adding k times row i to row j.


To get from B to
$$B' = E_r^T \cdots E_2^TE_1^T B E_1E_2 \cdots E_r,$$

therefore, we must apply column and row operations in pairs. Applying a
column operation and the corresponding row operation takes the matrix
B into the matrix $E_1^TBE_1$; the next pair takes this into the matrix
$E_2^TE_1^TBE_1E_2$; and so on.


The next question to be settled is: how do we decide what column and 
row operations to perform? Perhaps the best way to answer this question 
is to go through an example. 


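Example
Find a diagonal form for $B = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 0 \end{pmatrix}$.

Step 1

Use each non-zero diagonal element to reduce to zero each element in the
row and column occupied by that non-zero element. Starting with the
first non-zero diagonal element, namely the top left-hand entry, $b_{11} = 1$,
we reduce $b_{12}$ and $b_{21}$ to zero by subtracting column 1 from column 2, then
subtracting row 1 from row 2:

$$\begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} B \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$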


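Step 2

The second and third diagonal elements are now zero, so we have to
create a non-zero diagonal element in one or both of these positions. If
$b_{ij} \neq 0$ $(i \neq j)$, add column j to column i, and row j to row i; when
$b_{ii}$ and $b_{jj}$ are both zero, as here, this gives
$2b_{ij}$ in the ith place of the diagonal. In this case, then, we add column 3 to
column 2, and row 3 to row 2:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} B \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$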


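Step 3

We can now use the 2 in the (22) position to reduce the (23) and (32)
elements to zero. We do this by subtracting $\tfrac{1}{2}$ column 2 from column 3,
and $\tfrac{1}{2}$ row 2 from row 3:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -\tfrac{1}{2} & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix} B \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -\tfrac{1}{2} \\ 0 & 0 & 1 \end{pmatrix}$$

$$= \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 1 \\ \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix} B \begin{pmatrix} 1 & -1 & \tfrac{1}{2} \\ 0 & 1 & -\tfrac{1}{2} \\ 0 & 1 & \tfrac{1}{2} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -\tfrac{1}{2} \end{pmatrix}$$

Thus a diagonal form of B is $B' = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -\tfrac{1}{2} \end{pmatrix}$, with transition matrix
$P = \begin{pmatrix} 1 & -1 & \tfrac{1}{2} \\ 0 & 1 & -\tfrac{1}{2} \\ 0 & 1 & \tfrac{1}{2} \end{pmatrix}$. Since this is the transition matrix, the underlying
symmetric bilinear form has matrix representation B' with respect to the
basis

$$\{(1, 0, 0), (-1, 1, 1), (\tfrac{1}{2}, -\tfrac{1}{2}, \tfrac{1}{2})\}$$
the coordinates of which are expressed in terms of the original basis.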


In the above example we have explicitly recorded all the elementary 
matrices, although at each stage one is the transpose of the other. This acts 
as a check, but can of course be dropped. 
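
The paired column and row operations can be carried out mechanically. The following is a minimal sketch under stated assumptions, not the course's own algorithm: the function names `congruence_diagonalize` and `congruent` are hypothetical, exact arithmetic is done with Python's `Fraction`, and when a diagonal entry is zero the sketch first tries to interchange with a later non-zero diagonal entry before falling back on the "add column j to column i" step described above.

```python
from fractions import Fraction


def congruence_diagonalize(B):
    """Return (D, P) with D = P^T B P diagonal, for a symmetric matrix B
    given as a list of rows. Every column operation is followed by the
    matching row operation; P accumulates the column operations."""
    n = len(B)
    A = [[Fraction(x) for x in row] for row in B]
    P = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]

    def add_multiple(src, dst, k):
        # Column operation: column dst += k * column src (on A and on P) ...
        for r in range(n):
            A[r][dst] += k * A[r][src]
            P[r][dst] += k * P[r][src]
        # ... then the corresponding row operation on A.
        for c in range(n):
            A[dst][c] += k * A[src][c]

    def interchange(i, j):
        # Interchange columns i and j (on A and P), then rows i and j of A.
        for r in range(n):
            A[r][i], A[r][j] = A[r][j], A[r][i]
            P[r][i], P[r][j] = P[r][j], P[r][i]
        A[i], A[j] = A[j], A[i]

    for i in range(n):
        if A[i][i] == 0:
            swap = next((j for j in range(i + 1, n) if A[j][j] != 0), None)
            off = next((j for j in range(i + 1, n) if A[i][j] != 0), None)
            if swap is not None:
                interchange(i, swap)
            elif off is not None:
                add_multiple(off, i, Fraction(1))  # puts 2*b_ij on the diagonal
            else:
                continue  # row and column i are already zero
        for j in range(i + 1, n):
            add_multiple(i, j, -A[i][j] / A[i][i])
    return A, P


def congruent(P, B):
    """Compute P^T B P directly, as an independent check."""
    n = len(B)
    return [[sum(P[k][r] * B[k][l] * P[l][c]
                 for k in range(n) for l in range(n))
             for c in range(n)] for r in range(n)]


B = [[1, 1, 0], [1, 1, 1], [0, 1, 0]]
D, P = congruence_diagonalize(B)
print([str(D[i][i]) for i in range(3)])  # prints: ['1', '2', '-1/2']
```

For the matrix B of the example this sketch happens to perform the same three pairs of operations as Steps 1 to 3, so it returns the same diagonal form and the same transition matrix P.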


Completing the Square 


For simple quadratic forms the method of completing the square is easy 
and useful. The idea is to express the quadratic form as a linear combina- 
tion of squares, and the method is to keep subtracting (or adding) appro- 
priately constructed multiples of squares so as to leave one less variable 
in the quadratic form each time. 


Example
Suppose we have the quadratic form
$$q((x)) = x_1^2 + 2x_1x_2 + 2x_1x_3 - x_2^2 + 3x_3^2.$$
If we select a basis $\{\alpha_1, \alpha_2, \alpha_3\}$ of $R^3$ such that
$$q((x)) = k_1 x_1'^2 + k_2 x_2'^2 + k_3 x_3'^2$$
where
$$(x) = x_1'\alpha_1 + x_2'\alpha_2 + x_3'\alpha_3 \qquad ((x) \in R^3),$$

then the choice of $\{\alpha_1, \alpha_2, \alpha_3\}$ as a basis for $R^3$ has diagonalized the
expression for q((x)). In practice, rather than look for $\alpha_1$, $\alpha_2$, $\alpha_3$ explicitly,
it is easier to look for expressions

$$x_1' = a_{11}x_1 + a_{12}x_2 + a_{13}x_3$$
$$x_2' = a_{21}x_1 + a_{22}x_2 + a_{23}x_3$$
$$x_3' = a_{31}x_1 + a_{32}x_2 + a_{33}x_3$$
such that

$$x_1^2 + 2x_1x_2 + 2x_1x_3 - x_2^2 + 3x_3^2 = k_1 x_1'^2 + k_2 x_2'^2 + k_3 x_3'^2.$$
First, can we choose $x_1'$ and $k_1$ so that

$$q((x)) - k_1 x_1'^2$$

depends only on $x_2$ and $x_3$? In other words, can we choose $a_{11}x_1 + a_{12}x_2 + a_{13}x_3$
such that the coefficients of $x_1^2$, $x_1x_2$ and $x_1x_3$ in $(a_{11}x_1 + a_{12}x_2 + a_{13}x_3)^2$
are proportional to those in q((x))? If so, then all we do is
subtract $x_1'^2$ multiplied by the relevant constant $k_1$ from q((x)) and we are
left with an expression in $x_2$, $x_3$.


Now the terms involving $x_1$ in the expansion of $(a_{11}x_1 + a_{12}x_2 + a_{13}x_3)^2$
are:

$$a_{11}^2x_1^2 + 2a_{11}a_{12}x_1x_2 + 2a_{11}a_{13}x_1x_3 = a_{11}(a_{11}x_1^2 + 2a_{12}x_1x_2 + 2a_{13}x_1x_3).$$

Thus (provided $a_{11} \neq 0$) we can choose $k_1 = 1/a_{11}$ and then get the right
answer by choosing

$a_{11}$ = coefficient of $x_1^2$ in q((x)),
$a_{12}$ = $\tfrac{1}{2}$ coefficient of $x_1x_2$ in q((x)),
$a_{13}$ = $\tfrac{1}{2}$ coefficient of $x_1x_3$ in q((x)).

(Note the similarity to the rule of thumb we mentioned in sub-section
14.2.2 for finding the polar form of a quadratic form.)


In this particular case, then, we take
$$x_1' = x_1 + x_2 + x_3$$
and hence $k_1 = 1$.
This gives
$$q((x)) - x_1'^2 = q((x)) - (x_1^2 + x_2^2 + x_3^2 + 2x_1x_2 + 2x_2x_3 + 2x_1x_3)$$
$$= -2x_2^2 + 2x_3^2 - 2x_2x_3.$$

To eliminate $x_2$ from the remaining expression on the right, we apply the
same technique: $x_2' = a_{22}x_2 + a_{23}x_3$, where $a_{22}$ is the coefficient of $x_2^2$,
and $a_{23}$ is half the coefficient of $x_2x_3$.

Thus

$$x_2' = -2x_2 - x_3$$
and hence $k_2 = -\tfrac{1}{2}$.
This gives

$$q((x)) - x_1'^2 + \tfrac{1}{2}x_2'^2 = q((x)) - x_1'^2 + \tfrac{1}{2}(4x_2^2 + 4x_2x_3 + x_3^2) = \tfrac{5}{2}x_3^2.$$

Finally, therefore, we get

$$q((x)) = x_1'^2 - \tfrac{1}{2}x_2'^2 + \tfrac{5}{2}x_3^2 = (x_1 + x_2 + x_3)^2 - \tfrac{1}{2}(2x_2 + x_3)^2 + \tfrac{5}{2}x_3^2.$$

(We have suppressed the minuses in $x_2'$, since $(-1)^2 = 1$.)




The matrix of q with respect to the new basis (which we have not calculated
explicitly) is

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & -\tfrac{1}{2} & 0 \\ 0 & 0 & \tfrac{5}{2} \end{pmatrix}$$
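
A quick way to guard against slips when completing the square is to evaluate both sides at a few points. The sketch below is illustrative only; the function names are assumptions of this example. It compares the original form with the sum of squares found above at every point of a small integer grid; since both sides are polynomials of degree two, agreement at three or more values of each variable means the two expressions are identical.

```python
from fractions import Fraction
from itertools import product


def q(x1, x2, x3):
    # The quadratic form of the worked example.
    return x1**2 + 2*x1*x2 + 2*x1*x3 - x2**2 + 3*x3**2


def sum_of_squares(x1, x2, x3):
    # The diagonalized expression obtained by completing the square.
    return ((x1 + x2 + x3)**2
            - Fraction(1, 2) * (2*x2 + x3)**2
            + Fraction(5, 2) * x3**2)


grid = range(-2, 3)  # five values per variable
identity_holds = all(q(*p) == sum_of_squares(*p) for p in product(grid, repeat=3))
print(identity_holds)  # prints: True
```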


This method of diagonalizing a quadratic form always works provided 
that, at each step in the process, there is an x? term left in the expression. 
If this fails to happen, then we have to alter our procedure and take an 
intermediate step to create non-zero elements on the diagonal. 


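Example

$$q((x)) = x_1x_2 + x_2x_3 + x_3x_1$$

In this case, we cannot straightforwardly eliminate $x_1$ by the subtraction
or addition of a single square, since to eliminate the $x_1x_2$ and $x_1x_3$ terms
we would have to create an $x_1^2$ term. The technique here is to define the
intermediate variables

$$x_1' = x_1 + x_2$$
$$x_2' = x_1 - x_2.$$

We clearly still have linear independence of the linear functionals $x_1'$, $x_2'$
and $x_3$, and the expression for q((x)) becomes

$$q((x)) = (\tfrac{1}{2}x_1' + \tfrac{1}{2}x_2')(\tfrac{1}{2}x_1' - \tfrac{1}{2}x_2') + (\tfrac{1}{2}x_1' - \tfrac{1}{2}x_2')x_3 + x_3(\tfrac{1}{2}x_1' + \tfrac{1}{2}x_2')$$
$$= \tfrac{1}{4}x_1'^2 - \tfrac{1}{4}x_2'^2 + x_1'x_3.$$
Now we proceed exactly as before. Define
$$x_1'' = \tfrac{1}{4}x_1' + \tfrac{1}{2}x_3$$
and we have diagonalized q:

$$q((x)) = 4x_1''^2 - \tfrac{1}{4}x_2'^2 - x_3^2$$
$$= 4(\tfrac{1}{4}x_1 + \tfrac{1}{4}x_2 + \tfrac{1}{2}x_3)^2 - \tfrac{1}{4}(x_1 - x_2)^2 - x_3^2$$
$$= \tfrac{1}{4}(x_1 + x_2 + 2x_3)^2 - \tfrac{1}{4}(x_1 - x_2)^2 - x_3^2.$$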


READ from line — 14 on page N166 to the end of the section on page N167. 


Note 


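line -2, page N166 If $X^TBX = \sum_{i,j} x_i b_{ij} x_j$ and B is symmetric, then $b_{ij} = b_{ji} = \tfrac{1}{2}$
(the coefficient of the $x_ix_j$ term in the quadratic expression).


For example

$$x^2 + 3xy + y^2 = x^2 + \tfrac{3}{2}xy + \tfrac{3}{2}yx + y^2 = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} 1 & \tfrac{3}{2} \\ \tfrac{3}{2} & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$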


Exercises 


1. Use elementary row and column operations to find diagonal forms 
for the following symmetric matrices. (They are the matrices repre- 
senting the quadratic forms in the examples worked by the method of 
completing the square.) 


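(i) $A = \begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \\ 1 & 0 & 3 \end{pmatrix}$   (ii) $B = \begin{pmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 \end{pmatrix}$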


2. Reduce to diagonal form by the method of completing the square 
the quadratic forms given in Exercise 1, page N162. (Do as many as 
you feel you need to.) 


Solutions 


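1. (i) $$\begin{pmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \\ 1 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} A \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Subtract row 1 from row 2, and then subtract column 1
from column 2:

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & -2 & -1 \\ 1 & -1 & 3 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} A \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Subtract row 1 from row 3, and then subtract column 1
from column 3:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & -2 & -1 \\ 0 & -1 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix} A \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Subtract $\tfrac{1}{2}$ row 2 from row 3, and then subtract $\tfrac{1}{2}$ column
2 from column 3:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & \tfrac{5}{2} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ -\tfrac{1}{2} & -\tfrac{1}{2} & 1 \end{pmatrix} A \begin{pmatrix} 1 & -1 & -\tfrac{1}{2} \\ 0 & 1 & -\tfrac{1}{2} \\ 0 & 0 & 1 \end{pmatrix}$$

(N.B. The diagonal form obtained by completing the
square can be obtained by: row 2 becomes $\tfrac{1}{2}$ row 2, and
then column 2 becomes $\tfrac{1}{2}$ column 2.)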


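(ii) $$\begin{pmatrix} 0 & \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} & 0 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} B \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Since there is no non-zero element on the diagonal, add
row 2 to row 1, and then add column 2 to column 1:

$$\begin{pmatrix} 1 & \tfrac{1}{2} & 1 \\ \tfrac{1}{2} & 0 & \tfrac{1}{2} \\ 1 & \tfrac{1}{2} & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} B \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Subtract $\tfrac{1}{2}$ row 1 from row 2, and then subtract $\tfrac{1}{2}$
column 1 from column 2:

$$\begin{pmatrix} 1 & 0 & 1 \\ 0 & -\tfrac{1}{4} & 0 \\ 1 & 0 & 0 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} B \begin{pmatrix} 1 & -\tfrac{1}{2} & 0 \\ 1 & \tfrac{1}{2} & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

Subtract row 1 from row 3, and subtract column 1 from
column 3:

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & -\tfrac{1}{4} & 0 \\ 0 & 0 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\ -1 & -1 & 1 \end{pmatrix} B \begin{pmatrix} 1 & -\tfrac{1}{2} & -1 \\ 1 & \tfrac{1}{2} & -1 \\ 0 & 0 & 1 \end{pmatrix}$$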


2. (a) $2x^2 + 3xy + 6y^2$: subtract $\tfrac{1}{2}(2x + \tfrac{3}{2}y)^2$.
This leaves $\tfrac{39}{8}y^2$. Thus
$2x^2 + 3xy + 6y^2 = \tfrac{1}{2}(2x + \tfrac{3}{2}y)^2 + \tfrac{39}{8}y^2$.
(b) $8xy + 4y^2$: subtract $\tfrac{1}{4}(4x + 4y)^2 = 4(x + y)^2$.
This leaves $-4x^2$. Thus
$8xy + 4y^2 = 4(x + y)^2 - 4x^2$.
(c) $x^2 + 2xy + 4xz + 3y^2 + yz + 7z^2$: subtract $(x + y + 2z)^2$.
This leaves $2y^2 - 3yz + 3z^2$: subtract $\tfrac{1}{2}(2y - \tfrac{3}{2}z)^2$. This
leaves $\tfrac{15}{8}z^2$. Thus
$x^2 + 2xy + 4xz + 3y^2 + yz + 7z^2 = (x + y + 2z)^2 + \tfrac{1}{2}(2y - \tfrac{3}{2}z)^2 + \tfrac{15}{8}z^2$.
(d) $4xy = (x + y)^2 - (x - y)^2$.
(e) $x^2 + 4xy + 4y^2 + 2xz + z^2 + 4yz$:
subtract $(x + 2y + z)^2$. This leaves 0; thus
$x^2 + 4xy + 4y^2 + 2xz + z^2 + 4yz = (x + 2y + z)^2$.
(f) $x^2 + 4xy - 2y^2 = (x + 2y)^2 - 6y^2$.
(g) $x^2 + 6xy - 2y^2 - 2yz + z^2$: subtract $(x + 3y)^2$.
This leaves $-11y^2 - 2yz + z^2$: subtract $(y - z)^2$. This
leaves $-12y^2$. Thus
$x^2 + 6xy - 2y^2 - 2yz + z^2 = (x + 3y)^2 + (y - z)^2 - 12y^2$.


14.3.2 Getting a Diagonal Matrix into Normal Form


Having diagonalized the matrix of a bilinear form over V, we want to 
simplify the diagonal elements. 


The extent of the simplification depends on whether V is over the field of
real numbers or the field of complex numbers. (We will not consider
other fields.) Having got the matrix into diagonal form, we want to keep it
there; this means that we can only multiply our basis elements by non-zero
scalars. How much freedom of action does this give us? Suppose
$\{\alpha_1, \ldots, \alpha_n\}$ is a basis of V with respect to which the matrix $B = [b_{ij}]$ of a
bilinear form f is diagonal. Then if we change the basis element $\alpha_1$ to $a\alpha_1$
where a is a non-zero scalar, we find that $b_{11}$ is replaced by $a^2b_{11}$,
for $f(a\alpha_1, a\alpha_1) = a^2f(\alpha_1, \alpha_1) = a^2b_{11}$. Similarly, we can multiply any other
diagonal element $b_{ii}$ by the square of a non-zero scalar. We still have the off-diagonal
entries $f(a\alpha_1, \alpha_j)$ equal to zero.


Suppose first that our field of scalars is R. Then the factor $a^2$ multiplying
a diagonal entry in this way is positive; furthermore, any positive number
has a square root. Thus any non-zero element $b_{ii}$ on the diagonal can be
reduced to either 1 or -1 depending on the sign of $b_{ii}$, by multiplying
the corresponding basis element by

$$\frac{1}{\sqrt{b_{ii}}} \quad \left(\text{or } \frac{1}{\sqrt{-b_{ii}}} \text{ if } b_{ii} \text{ is negative}\right).$$

On the other hand, zero elements off the diagonal remain zero. Finally,
we can re-arrange the basis elements so that the +1s come first, followed
by the -1s, followed by the 0s. This gives the following result.




Theorem 


If V is a vector space over R, the normal form for the matrix corresponding 
to a symmetric bilinear form f is a diagonal matrix whose diagonal entries 
consist entirely of 1s, — 1s and Os, which may for convenience be arranged 
in that order. 


If the field is C, however, the story is somewhat different; every element of
C has a square root. Thus every non-zero diagonal entry $b_{ii}$ in the matrix
can be brought to +1 by multiplying the corresponding basis element
by $\frac{1}{\sqrt{b_{ii}}}$. We therefore have the following result.


Theorem 


If V is a vector space over C, the normal form for the matrix corresponding
to a symmetric bilinear form f is a diagonal matrix whose diagonal
entries consist entirely of 1s and 0s, with the 1s preceding the 0s for
convenience.
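
The two theorems can be illustrated by rescaling the diagonal entries directly. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, floating-point arithmetic is used, and `round` absorbs the tiny rounding errors in $a^2b_{ii}$.

```python
import cmath
import math


def real_normal_form(diagonal):
    """Rescale each basis element by a = 1/sqrt(|b|) (real field), so that
    each non-zero entry b becomes a^2 b = +1 or -1; then list the +1s
    first, the -1s next and the 0s last."""
    scaled = []
    for b in diagonal:
        a = 1.0 if b == 0 else 1.0 / math.sqrt(abs(b))
        scaled.append(round(a * a * b))
    order = {1: 0, -1: 1, 0: 2}
    return sorted(scaled, key=order.get)


def complex_normal_form(diagonal):
    """Rescale by a = 1/sqrt(b), using a complex square root, so that
    every non-zero entry becomes 1; then list the 1s before the 0s."""
    scaled = []
    for b in diagonal:
        a = 1.0 if b == 0 else 1.0 / cmath.sqrt(b)
        scaled.append(round((a * a * b).real))
    return sorted(scaled, reverse=True)


# Diagonal entries 1, -1/2, 5/2 from the first completing-the-square example.
print(real_normal_form([1, -0.5, 2.5]))     # prints: [1, 1, -1]
print(complex_normal_form([1, -0.5, 2.5]))  # prints: [1, 1, 1]
```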


Exercises 


1. Write down the normal forms of the matrices of the quadratic forms
in the final exercise of sub-section 14.3.1. Their diagonal forms are 
given in the solution to that exercise. Assume that the underlying 
field is R. 


2. Repeat Exercise 1 but take the underlying field to be C. 


Solutions 


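1. (a) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$   (b) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$   (c) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

(d) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$   (e) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$   (f) $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$

(g) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$


2. As above but with -1 replaced by 1 wherever it occurs, e.g.
(b) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.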


14.3.3 Uniqueness of the Normal Form 


In sub-section 14.3.2 we showed that, from a given diagonal form, we
can get to a unique normal form (different for the fields R and C). However,
we did not show that different diagonal forms for the same quadratic
form will lead to the same normal form. What we have to show is that
any diagonal form contains a unique number of positive, negative and
zero diagonal elements (in the case of R), or a unique number of non-zero
and zero elements (in the case of C). If we can show this, then the normal
form will be unique.


First of all, it is easy enough to establish that the number of zeros on the
diagonal in the diagonal form of the matrix is unique. Since the diagonal
form of B is $P^TBP$ for some non-singular matrix P, and since $P^T$ is also
non-singular, it follows (Theorem 3.7 on page N48) that the rank of B is
equal to the rank of $P^TBP$; and this is, of course, equal to the number of
non-zero diagonal elements of $P^TBP$. This completely solves the uniqueness
problem in the case of the complex field, C, as all we want to know
is the number of non-zero, and the number of zero, diagonal elements in
the diagonal form.


Definition 


The rank of a quadratic form (or of the corresponding polar form) is 
equal to the rank of the matrix representing it. 


For the case of the real field, we want a further result; of the non-zero
diagonal elements, we want to know that the number of positive elements 
is unique. This is proved in the first part of Section IV-11 of N. 
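
The claim can be illustrated with the two diagonal forms found in sub-section 14.3.1 for the same quadratic form: diag(1, -2, 5/2) from elementary operations and diag(1, -1/2, 5/2) from completing the square. The sketch below is an illustration, not a proof; the function name `inertia` is hypothetical, and "signature" is taken here to mean the number of positive entries minus the number of negative entries, which is an assumption chosen to agree with the table of solutions later in this sub-section.

```python
from fractions import Fraction


def inertia(diagonal):
    """Count the positive, negative and zero entries of a real diagonal form."""
    pos = sum(1 for d in diagonal if d > 0)
    neg = sum(1 for d in diagonal if d < 0)
    return pos, neg, len(diagonal) - pos - neg


by_row_and_column_operations = [1, -2, Fraction(5, 2)]
by_completing_the_square = [1, Fraction(-1, 2), Fraction(5, 2)]

first = inertia(by_row_and_column_operations)
second = inertia(by_completing_the_square)
print(first, second)  # prints: (2, 1, 0) (2, 1, 0)

pos, neg, zero = first
rank = pos + neg        # number of non-zero diagonal entries
signature = pos - neg   # assumed convention, see the lead-in
print(rank, signature)  # prints: 3 1
```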


READ from the beginning of Section IV-11 on page N168 to line 5 on 
page N169. 


Notes 


(i) Theorem 11.1, page N168 The proof depends on a theorem:
$$\dim U + \dim W = \dim(U + W) + \dim(U \cap W),$$

which we have not covered in this course. It is Theorem 4.8 on page N22. Here
U + W means the subspace $\{\alpha + \beta : \alpha \in U \text{ and } \beta \in W\}$.

(ii) line -4, page N168 r is the rank of q.

(iii) lines -3 to -1, page N168 It is important to remember what non-negative
semi-definite and positive definite forms are, less important to remember what
a signature is. A non-negative semi-definite quadratic form has no -1s (but
possibly some 0s) in its normal form; a positive definite quadratic form has all
1s in its normal form. In matrix language, A is non-negative semi-definite if
and only if $X^TAX \ge 0$ for all one-column matrices X, and is positive definite
if and only if, in addition, $X^TAX = 0$ implies $X = 0$.


Exercises 

1. Exercise 1, page N170.
2. Exercise 2, page N170.
3. Exercise 3, page N170.
4. Exercise 4, page N170.


(Use the results of the exercises of the preceding sub-section of this text
to help you with Exercise 1. In Exercise 2, consider the cases $a \neq 0$ and
a = 0 separately.)


Solutions 
1.
      Rank   Signature
(a)    2       2
(b)    2       0
(c)    3       3
(d)    2       0
(e)    1       1
(f)    2       0
(g)    3       1

2. There are two cases to consider: $a \neq 0$ and a = 0.
Case (i): $a \neq 0$. Completing the square gives

$$Q(x, y) = \frac{1}{a}\left(ax + \tfrac{1}{2}by\right)^2 + \left(c - \frac{b^2}{4a}\right)y^2,$$

which is positive definite if and only if the coefficients
$\frac{1}{a}$ and $c - \frac{b^2}{4a}$
are positive, i.e. if and only if a > 0 and $b^2 - 4ac < 0$.
Case (ii): a = 0.
$$Q(x, y) = bxy + cy^2$$

We have to show that Q(x, y) can never be positive definite
in this case. If c is also zero, then completing the square
gives

$$Q(x, y) = bxy = \frac{b}{4}\left[(x + y)^2 - (x - y)^2\right],$$

which cannot be positive definite. If $c \neq 0$, then

$$Q(x, y) = \frac{1}{c}\left(\tfrac{1}{2}bx + cy\right)^2 - \frac{b^2}{4c}x^2$$

and either $\frac{1}{c} < 0$ or $-\frac{b^2}{4c} \le 0$.

Thus if a = 0, Q(x, y) can never be positive definite.


3. The normal form of A is I; thus there is a real non-singular
matrix Q such that $Q^TAQ = I$. If we left-multiply each side
by $(Q^T)^{-1}$ and right-multiply each side by $Q^{-1}$, we get

$$A = (Q^{-1})^TQ^{-1}.$$
Thus $Q^{-1}$ is the required matrix.


4. If we pre-multiply $A^TA$ by the non-singular matrix
$P^T = (A^T)^{-1}$ and post-multiply it by the non-singular matrix
$P = A^{-1}$, we get
$$P^TA^TAP = I.$$
Thus $A^TA$ has I as its normal form, and is therefore positive
definite.


14.3.4 An Application of Real Quadratic Forms 


Let us look again at the example in sub-section 14.2.3 (the two-variable
Taylor expansion). We had

$$f(x_1, x_2) \approx f(0, 0) + P(x_1, x_2) + Q(x_1, x_2) \qquad (1)$$
where P is a linear functional on $R^2$, and Q is a quadratic form on $R^2$
(see page 25). Suppose $f_1(0, 0) = f_2(0, 0) = 0$, so that $P(x_1, x_2) = 0$ for all $(x_1, x_2)$. Then,
as we showed in Unit M100 15, f has a stationary value at (0, 0), and may
have a local maximum or minimum at (0, 0). We did not discuss in the
above unit, however, the technique of working out from the second
partial derivatives what kind of stationary value it is. For functions of
one variable, the technique is: if $f'(x_0) = 0$, then there is a stationary
value at $x_0$, which is a local maximum if $f''(x_0) < 0$ and a local minimum
if $f''(x_0) > 0$. If $f''(x_0) = 0$, this classification method breaks down and
another method must be used.


An analogous situation exists for functions of two variables. For simplicity
of notation, we will suppose the stationary value is at $x_1 = x_2 = 0$; then
Equation (1) becomes

$$f(x_1, x_2) \approx f(0, 0) + Q(x_1, x_2) \qquad (2)$$

since
$$P(x_1, x_2) = x_1 f_1(0, 0) + x_2 f_2(0, 0) = 0.$$




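Now let $\{\alpha_1, \alpha_2\}$ be a basis of $R^2$ with respect to which Q is in normal
form; let

$$(x_1, x_2) = u_1\alpha_1 + u_2\alpha_2 \qquad ((x_1, x_2) \in R^2)$$

so that $u_1$ and $u_2$ are the new variables with respect to which we express
Equation (2).


Then Equation (2) has one of the following forms.

If the normal form of Q is $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, then
$$f(x_1, x_2) \approx f(0, 0) + u_1^2 + u_2^2. \qquad (3)$$

If the normal form of Q is $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$, then
$$f(x_1, x_2) \approx f(0, 0) + u_1^2 - u_2^2. \qquad (4)$$

If the normal form of Q is $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$, then
$$f(x_1, x_2) \approx f(0, 0) - u_1^2 - u_2^2. \qquad (5)$$

If the normal form of Q is $\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$, then
$$f(x_1, x_2) \approx f(0, 0) + u_1^2. \qquad (6)$$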


Thus, if the (real) normal form of $Q$ is $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, i.e. if $Q$ is positive
definite, then there is a local minimum at $(0, 0)$.

In the case of the normal form $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$, $f(x_1, x_2)$ increases with
increasing $u_1$ if we go away from $(0, 0)$ in the direction of $\alpha_1$, but decreases
with increasing $u_2$ if we go away from $(0, 0)$ in the direction of $\alpha_2$. Thus
we get a saddle point at $(0, 0)$, shown below.


[Figure: saddle point at (0, 0), showing the tangent plane.]


In the case of the normal form $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$, we clearly get a local
maximum at $(0, 0)$.


For Equation (6), we do not know whether $f(x_1, x_2)$ increases or decreases
in the $\alpha_2$ direction; the approximation is not good enough to tell us. The
same goes for the normal forms $\begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$. Thus we can classify
the stationary point using $Q$ whenever the rank of $Q$ is 2, and we fail
whenever the rank is less than 2.
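The case analysis above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `diagonal_entries` and `classify` are hypothetical, and the quadratic form is taken to be written as $Q(x_1, x_2) = a x_1^2 + b x_1 x_2 + c x_2^2$, so that for a Taylor approximation $a = \tfrac{1}{2} f_{11}(0,0)$, $b = f_{12}(0,0)$ and $c = \tfrac{1}{2} f_{22}(0,0)$. The diagonal form is found by completing the square, and only the signs of the diagonal coefficients are used.

```python
from fractions import Fraction


def diagonal_entries(a, b, c):
    """Diagonal coefficients of Q = a*x1^2 + b*x1*x2 + c*x2^2 after completing the square."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a != 0:
        # Q = a*(x1 + (b/2a)*x2)^2 + (c - b^2/(4a))*x2^2
        return a, c - b * b / (4 * a)
    if c != 0:
        # The same step with the roles of x1 and x2 exchanged.
        return c, a - b * b / (4 * c)
    # a = c = 0:  b*x1*x2 = (b/4)*(x1 + x2)^2 - (b/4)*(x1 - x2)^2
    return b / 4, -b / 4


def classify(a, b, c):
    """Classify the stationary point from the signs in the diagonal form of Q."""
    d = diagonal_entries(a, b, c)
    positive = sum(1 for entry in d if entry > 0)
    negative = sum(1 for entry in d if entry < 0)
    if positive == 2:
        return "local minimum"
    if negative == 2:
        return "local maximum"
    if positive == 1 and negative == 1:
        return "saddle point"
    return "not classified (rank < 2)"


print(classify(1, -3, 1))    # Q = x1^2 - 3*x1*x2 + x2^2
print(classify(-1, 0, -1))   # Q = -x1^2 - x2^2
```

Exact fractions are used so that a zero coefficient (the rank-deficient case) is detected exactly rather than up to rounding.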
Example
Classify the stationary point at $(0, 0)$ of the function

$$f: (x_1, x_2) \longmapsto 4 + x_1^2 - 3x_1x_2 + x_2^2 + \text{(terms of degree higher than two)} \qquad ((x_1, x_2) \in R^2)$$


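The quadratic approximation is (dropping the terms of higher than
quadratic degree)

$$f(x_1, x_2) \simeq 4 + 0 + Q(x_1, x_2)$$

with

$$\begin{aligned}
Q(x_1, x_2) &= x_1^2 - 3x_1x_2 + x_2^2 \\
&= \left(x_1 - \tfrac{3}{2}x_2\right)^2 - \tfrac{5}{4}x_2^2
\end{aligned}$$

(completing the square).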




The normal form is therefore $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ and consequently the stationary
point is a saddle point.
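For readers who want the intermediate steps, the following worked equation spells out how completing the square for $Q(x_1, x_2) = x_1^2 - 3x_1x_2 + x_2^2$ leads to the stated normal form. The variables $u_1$ and $u_2$ are one possible choice made here for illustration; any non-singular substitution giving the same signs would serve equally well.

```latex
\begin{aligned}
Q(x_1, x_2) &= x_1^2 - 3x_1x_2 + x_2^2 \\
            &= \left(x_1 - \tfrac{3}{2}x_2\right)^2 - \tfrac{9}{4}x_2^2 + x_2^2 \\
            &= \left(x_1 - \tfrac{3}{2}x_2\right)^2 - \tfrac{5}{4}x_2^2 \\
            &= u_1^2 - u_2^2,
\qquad u_1 = x_1 - \tfrac{3}{2}x_2, \quad u_2 = \tfrac{\sqrt{5}}{2}\,x_2 .
\end{aligned}
```

The substitution is non-singular, since $(u_1, u_2)$ determines $(x_1, x_2)$ uniquely, so the diagonal coefficients $1$ and $-1$ give the normal form directly.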


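Example
Classify the stationary point at $(0, 0)$ of the function

$$f: (x_1, x_2) \longmapsto \cos(x_1 + x_2) \qquad ((x_1, x_2) \in R^2)$$

The first stage is to calculate the first and second partial derivatives:

$$\begin{aligned}
f_1(x_1, x_2) &= -\sin(x_1 + x_2) \\
f_2(x_1, x_2) &= -\sin(x_1 + x_2) \\
f_{11}(x_1, x_2) &= -\cos(x_1 + x_2) \\
f_{12}(x_1, x_2) &= -\cos(x_1 + x_2) \\
f_{22}(x_1, x_2) &= -\cos(x_1 + x_2)
\end{aligned}$$

Then

$$\begin{aligned}
f(x_1, x_2) &\simeq f(0, 0) + x_1 f_1(0, 0) + x_2 f_2(0, 0) \\
&\qquad + \tfrac{1}{2}x_1^2 f_{11}(0, 0) + x_1x_2 f_{12}(0, 0) + \tfrac{1}{2}x_2^2 f_{22}(0, 0) \\
&= f(0, 0) - x_1 \sin(0) - x_2 \sin(0) \\
&\qquad - \tfrac{1}{2}x_1^2 \cos(0) - x_1x_2 \cos(0) - \tfrac{1}{2}x_2^2 \cos(0) \\
&= f(0, 0) - \tfrac{1}{2}x_1^2 - x_1x_2 - \tfrac{1}{2}x_2^2 \\
&= f(0, 0) - \tfrac{1}{2}(x_1 + x_2)^2
\end{aligned}$$

and so the quadratic form has $\begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix}$ as a normal form, and this method
does not enable us to classify the stationary point in this case.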

The method is more successful in the following cases, which we give as 
an exercise. 

Exercise

Classify the stationary point $(0, 0)$ as a local maximum, local minimum,
or saddle point for the following functions.

(i) $f: (x_1, x_2) \longmapsto \cos(x_1 + x_2) + \cos(x_1 - x_2) \qquad ((x_1, x_2) \in R^2)$

(ii) $g: (x_1, x_2) \longmapsto \cos(x_1 + x_2) - \cos(x_1 - x_2) \qquad ((x_1, x_2) \in R^2)$


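Solution

(i)
$$\begin{aligned}
f_1(x_1, x_2) &= -\sin(x_1 + x_2) - \sin(x_1 - x_2) \\
f_2(x_1, x_2) &= -\sin(x_1 + x_2) + \sin(x_1 - x_2) \\
f_{11}(x_1, x_2) &= -\cos(x_1 + x_2) - \cos(x_1 - x_2) \\
f_{12}(x_1, x_2) &= -\cos(x_1 + x_2) + \cos(x_1 - x_2) \\
f_{22}(x_1, x_2) &= -\cos(x_1 + x_2) - \cos(x_1 - x_2)
\end{aligned}$$

Thus

$$\begin{aligned}
f_1(0, 0) &= f_2(0, 0) = 0, \\
f_{11}(0, 0) &= -2 \\
f_{12}(0, 0) &= 0 \\
f_{22}(0, 0) &= -2 \\
f(x_1, x_2) &\simeq f(0, 0) - x_1^2 - x_2^2.
\end{aligned}$$

The normal form is $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$ and hence the stationary
point $(0, 0)$ is a local maximum.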




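(ii) Here,

$$\begin{aligned}
g_{11}(0, 0) &= -\cos(0) + \cos(0) = 0 \\
g_{12}(0, 0) &= -\cos(0) - \cos(0) = -2 \\
g_{22}(0, 0) &= -\cos(0) + \cos(0) = 0 \\
g(x_1, x_2) &\simeq g(0, 0) - 2x_1x_2 \\
&= g(0, 0) + \tfrac{1}{2}(x_1 - x_2)^2 - \tfrac{1}{2}(x_1 + x_2)^2
\end{aligned}$$

The normal form is $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ and hence the stationary point
$(0, 0)$ is a saddle point.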
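The second partial derivatives found in this exercise can be checked numerically. The sketch below is not part of the original solution: it estimates $f_{11}$, $f_{12}$ and $f_{22}$ at $(0, 0)$ by central differences, with a hypothetical helper `second_partials` and a step size `h` chosen here for illustration.

```python
import math


def second_partials(func, h=1e-3):
    """Central-difference estimates of (f11, f12, f22) at (0, 0)."""
    f0 = func(0.0, 0.0)
    f11 = (func(h, 0.0) - 2 * f0 + func(-h, 0.0)) / h**2
    f22 = (func(0.0, h) - 2 * f0 + func(0.0, -h)) / h**2
    f12 = (func(h, h) - func(h, -h) - func(-h, h) + func(-h, -h)) / (4 * h**2)
    return f11, f12, f22


def f(x1, x2):
    # Function (i) of the exercise.
    return math.cos(x1 + x2) + math.cos(x1 - x2)


def g(x1, x2):
    # Function (ii) of the exercise.
    return math.cos(x1 + x2) - math.cos(x1 - x2)


print(second_partials(f))  # close to (-2, 0, -2)
print(second_partials(g))  # close to (0, -2, 0)
```

The estimates are approximate, so they confirm the hand calculation only up to the accuracy of the difference quotients.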


14.3.5 Summary of Section 14.3 
In this section we defined the terms 


rank (page N164) 

signature (page N168) 

non-negative semi-definite (page N168) 

positive definite (page N168) 
Theorems

1. (10.1, page N163)
For a given symmetric matrix $B$ over a field $F$ (in which $1 + 1 \neq 0$), there
is a non-singular matrix $P$ such that $P^TBP$ is a diagonal matrix. In other
words, if $f_s$ is the underlying symmetric bilinear (polar) form, there is a
basis $A' = \{\alpha'_1, \ldots, \alpha'_n\}$ of $V$ such that $f_s(\alpha'_i, \alpha'_j) = 0$ whenever $i \neq j$.

2. (page C36)
If $V$ is a vector space over $R$, the normal form for the matrix corresponding
to a symmetric bilinear form $f$ is the diagonal matrix:

$$\operatorname{diag}(1, \ldots, 1, -1, \ldots, -1, 0, \ldots, 0)$$

3. (page C36)
If $V$ is a vector space over $C$, the normal form for the matrix corresponding
to a symmetric bilinear form $f$ is the diagonal matrix:

$$\operatorname{diag}(1, \ldots, 1, 0, \ldots, 0)$$

4. (Result from pages C36-7 and Theorem 11.1, page N168)
The number of positive and negative and null elements in a normal
form matrix representing a quadratic form over $R$ is unique.
The number of zero and non-zero elements in a normal form matrix
representing a quadratic form over $C$ is unique.


Techniques

1. Given a quadratic form over $R$ or $C$, find its normal form by the
method of row and column operations and by completing the square.

2. Determine the basis for a diagonal form for a quadratic form.

3. Given a particular $f(x_1, x_2)$, say what you can about maxima and
minima at $(0, 0)$.
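The method of row and column operations named in the first two techniques can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact sequence of operations used in the text: the names `congruence_diagonalise` and `inertia` are hypothetical, the entries are rational, and each column operation is followed by the matching row operation so that the result is $P^TBP$. The columns of the returned $P$ give a basis with respect to which the form is diagonal.

```python
from fractions import Fraction


def congruence_diagonalise(B):
    """Return (D, P) with P^T B P diagonal and D its diagonal, for symmetric B (list of rows)."""
    n = len(B)
    A = [[Fraction(x) for x in row] for row in B]
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def add_col(M, dst, src, m):          # column dst += m * column src
        for row in M:
            row[dst] += m * row[src]

    def add_row(M, dst, src, m):          # row dst += m * row src
        M[dst] = [x + m * y for x, y in zip(M[dst], M[src])]

    def both(dst, src, m):                # matching column and row operation on A
        add_col(A, dst, src, m)
        add_row(A, dst, src, m)
        add_col(P, dst, src, m)           # P records the column operations only

    def swap(i, j):                       # exchange columns i, j (and rows i, j of A)
        for M in (A, P):
            for row in M:
                row[i], row[j] = row[j], row[i]
        A[i], A[j] = A[j], A[i]

    for k in range(n):
        if A[k][k] == 0:
            j = next((j for j in range(k + 1, n) if A[j][j] != 0), None)
            if j is not None:
                swap(k, j)
            else:
                j = next((j for j in range(k + 1, n) if A[k][j] != 0), None)
                if j is None:
                    continue              # row and column k are already zero
                both(k, j, Fraction(1))   # pivot becomes 2*A[k][j], non-zero as 1 + 1 != 0
        for j in range(k + 1, n):
            both(j, k, -A[k][j] / A[k][k])
    return [A[i][i] for i in range(n)], P


def inertia(D):
    """Numbers of positive, negative and zero diagonal entries."""
    return (sum(d > 0 for d in D), sum(d < 0 for d in D), sum(d == 0 for d in D))


B = [[1, 0, 0], [0, 0, 2], [0, 2, -1]]    # matrix of x1^2 + 4*x2*x3 - x3^2
D, P = congruence_diagonalise(B)
print(D, inertia(D))
```

For a real form, scaling each non-zero diagonal entry $d$ by $1/\sqrt{|d|}$ then gives the normal form; only the counts returned by `inertia` matter for that step.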




14.4 SUMMARY OF THE UNIT


As far as theory is concerned, this unit is a logical extension of Unit 12,
Linear Functionals and Duality. The idea of a linear functional leads on
to the idea of a bilinear functional, or bilinear form, and from there to
the concept of a quadratic form, which is non-linear. There is a one-one
correspondence between quadratic and symmetric bilinear forms which
proves to be a useful tool, enabling us to analyse certain non-linear
problems using linear techniques.


In the first section we looked at various kinds of bilinear forms and their
matrix representation. We discovered that the matrices representing the
same bilinear form with respect to different bases are related by an
equivalence relation called congruence.

The second section investigated the relationship between a quadratic
form $q$ and its corresponding symmetric bilinear form, called the polar
form of $q$.

In the third section we developed two techniques for finding a simple
representative matrix for a bilinear form, termed the normal form. Finally
we applied the theory covered in this unit to analyse the stationary points
of a suitably differentiable real-valued function of two variables.


Definitions 
bilinear form (page N156) 
symmetric bilinear form (page N158) 


anti-symmetric bilinear form (page C14) 
skew-symmetric bilinear form (page N158) 


symmetric matrix (page N158) 
skew-symmetric matrix (page N159) 
congruent (page N158) 
quadratic form (page N160) 
polar form (page N161) 
rank (page N164) 
signature (page N168) 
non-negative semi-definite (page N168) 
positive definite (page N168) 
Theorems

1. (8.1, page N158)
A bilinear form $f$ is symmetric if and only if any matrix $B$ representing $f$
has the property $B^T = B$.

2. (8.2, page N158)
If a bilinear form $f$ is skew-symmetric, then any matrix $B$ representing $f$
has the property $B^T = -B$.

3. (8.3, page N158)
If $1 + 1 \neq 0$ and the matrix $B$ representing $f$ has the property $B^T = -B$,
then $f$ is skew-symmetric.

4. (8.4, page N159)
If $1 + 1 \neq 0$, every bilinear form can be represented uniquely as a sum of
a symmetric bilinear form and a skew-symmetric bilinear form.




5. (9.1, page N161)
Every symmetric bilinear form $f_s$ determines a unique quadratic form
by the rule $q(\alpha) = f_s(\alpha, \alpha)$, and if $1 + 1 \neq 0$, every quadratic form
determines a unique symmetric bilinear form
$f_s(\alpha, \beta) = \tfrac{1}{2}(q(\alpha + \beta) - q(\alpha) - q(\beta))$
from which it is in turn determined by the given rule. There is a one-to-one
correspondence between symmetric bilinear forms and quadratic forms.

6. (10.1, page N163)
For a given symmetric matrix $B$ over a field $F$ (in which $1 + 1 \neq 0$), there
is a non-singular matrix $P$ such that $P^TBP$ is a diagonal matrix. In other
words, if $f_s$ is the underlying symmetric bilinear (polar) form, there is a
basis $A' = \{\alpha'_1, \ldots, \alpha'_n\}$ of $V$ such that $f_s(\alpha'_i, \alpha'_j) = 0$ whenever $i \neq j$.

7. (page C36)
If $V$ is a vector space over $R$, the normal form for the matrix corresponding
to a symmetric bilinear form $f$ is the diagonal matrix:

$$\operatorname{diag}(1, \ldots, 1, -1, \ldots, -1, 0, \ldots, 0)$$

8. (page C36)
If $V$ is a vector space over $C$, the normal form for the matrix corresponding
to a symmetric bilinear form $f$ is the diagonal matrix:

$$\operatorname{diag}(1, \ldots, 1, 0, \ldots, 0)$$

9. (Result from pages C36-7 and Theorem 11.1, page N168)
The number of positive and negative and null elements in a normal form
matrix representing a quadratic form over $R$ is unique.
The number of zero and non-zero elements in a normal form matrix
representing a quadratic form over $C$ is unique.


Techniques

1. Given a bilinear form $f$, find $f_s$, $f_{ss}$.

2. Given a particular function of two variables, find the quadratic
Taylor approximation.

3. Given a quadratic form over $R$ or $C$, find its normal form by the
method of row and column operations and by completing the square.

4. Determine the basis for a diagonal form of a quadratic form.

5. Given a particular $f(x_1, x_2)$, say what you can about maxima and
minima at $(0, 0)$.

Notation
$f_s(\alpha, \beta)$ (page N159)
$f_{ss}(\alpha, \beta)$ (page N159)
$q(\alpha)$ (page N161)


14.5 SELF-ASSESSMENT 
Self-assessment Test 


This Self-assessment Test is designed to help you test your understanding
of the unit. It can also be used, together with the summary of the unit,
for revision. The answers to these questions will be found on the next
non-facing page. We suggest that you complete the whole test before looking
at the answers.


1. Let $f$ be the following bilinear form on $R^3$:

$$f((x), (y)) = x_1y_1 - x_1y_2 + x_2y_1 + x_2y_3 + 3x_3y_2 - x_3y_3 \qquad ((x), (y) \in R^3)$$

Calculate:

(i) the symmetric part, $f_s$, of $f$, and its matrix;
(ii) the skew-symmetric part, $f_{ss}$, of $f$, and its matrix;
(iii) the quadratic forms on $R^3$ corresponding to:

(a) $f$
(b) $f_s$
(c) $f_{ss}$


2. Show that the following formula gives the polar form for a quadratic
form $q$ on $R^n$:

$$f_s((x), (y)) = \tfrac{1}{2}\sum_{i=1}^{n} y_i\, q_i((x))$$

where $q_i((x))$ is the partial derivative of $q$ with respect to $x_i$.


3. Show that if $A$ is a matrix with real entries, then $A^TA$ is the matrix
of a real non-negative semi-definite quadratic form.


4. Determine whether the stationary point at $(0, 0)$ of the following
function is a local maximum, a local minimum, or a saddle point.

$$f: (x_1, x_2) \longmapsto e^{x_1 + x_2} + e^{-x_1 - x_2} + 2e^{x_1 - x_2} + 2e^{-x_1 + x_2} \qquad ((x_1, x_2) \in R^2)$$




Solutions to Self-assessment Test 


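1. (i) $f_s((x), (y)) = x_1y_1 + 2x_2y_3 + 2x_3y_2 - x_3y_3$;

matrix $= \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 2 \\ 0 & 2 & -1 \end{bmatrix}$.

(ii) $f_{ss}((x), (y)) = -x_1y_2 + x_2y_1 - x_2y_3 + x_3y_2$;

matrix $= \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}$.

(iii) (a) $q((x)) = x_1^2 + 4x_2x_3 - x_3^2$
(b) $q((x)) = x_1^2 + 4x_2x_3 - x_3^2$
(c) $q((x)) = 0$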
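The decomposition in Solution 1 can be reproduced mechanically from the matrix of $f$. The sketch below assumes the matrix $B$ of $f$ read off from the coefficients of $x_iy_j$ in Question 1, and uses the standard formulas $\tfrac{1}{2}(B + B^T)$ and $\tfrac{1}{2}(B - B^T)$; the function name `symmetric_and_skew_parts` is hypothetical.

```python
from fractions import Fraction


def symmetric_and_skew_parts(B):
    """Split a square matrix B into (B + B^T)/2 and (B - B^T)/2."""
    n = len(B)
    half = Fraction(1, 2)
    sym = [[half * (B[i][j] + B[j][i]) for j in range(n)] for i in range(n)]
    skew = [[half * (B[i][j] - B[j][i]) for j in range(n)] for i in range(n)]
    return sym, skew


# Entry (i, j) is the coefficient of x_i * y_j in f.
B = [[1, -1, 0],
     [1, 0, 1],
     [0, 3, -1]]
sym, skew = symmetric_and_skew_parts(B)
print(sym)
print(skew)
```

Dividing by 2 is where the condition $1 + 1 \neq 0$ enters; over the rationals or reals it causes no difficulty.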


2. If $a_i$ is the coefficient of the $x_i^2$ term in $q$, and $b_{ij}$ the coefficient of the
$x_ix_j$ term ($i \neq j$), for each $i$, $j$, then the partial derivative with respect
to $x_i$ will have terms in $x_1, x_2, \ldots, x_n$. The coefficient of the $x_i$ term
will be $2a_i$, and the coefficient of the $x_j$ term ($j \neq i$) will be $b_{ij}$. If we
multiply this derivative by $\tfrac{1}{2}y_i$ and sum over $i$, we get a bilinear form
whose $x_iy_i$ term has coefficient $a_i$, and whose $x_iy_j$ term ($i \neq j$) has
coefficient $\tfrac{1}{2}b_{ij}$. This is the same as the formula derived in sub-section
14.2.2 for the polar form of $q$.
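As a concrete check of this argument, the sketch below compares the derivative formula with the rule $f_s(\alpha, \beta) = \tfrac{1}{2}(q(\alpha + \beta) - q(\alpha) - q(\beta))$ for one particular form. The form $q((x)) = x_1^2 + 4x_2x_3 - x_3^2$ is taken from Solution 1; the function names, and the partial derivatives written out by hand in `grad_q`, are assumptions made for this illustration.

```python
def q(x):
    """The quadratic form q((x)) = x1^2 + 4*x2*x3 - x3^2."""
    x1, x2, x3 = x
    return x1**2 + 4 * x2 * x3 - x3**2


def grad_q(x):
    """Partial derivatives (q_1, q_2, q_3) of q, computed by hand."""
    x1, x2, x3 = x
    return (2 * x1, 4 * x3, 4 * x2 - 2 * x3)


def polar_from_derivatives(x, y):
    """(1/2) * sum over i of y_i * q_i((x))."""
    return sum(yi * qi for yi, qi in zip(y, grad_q(x))) / 2


def polar_from_q(x, y):
    """(1/2) * (q(x + y) - q(x) - q(y))."""
    s = tuple(a + b for a, b in zip(x, y))
    return (q(s) - q(x) - q(y)) / 2


x, y = (1, 2, 3), (4, 5, 6)
print(polar_from_derivatives(x, y), polar_from_q(x, y))
```

Both expressions should agree for every pair of vectors, and setting `y = x` in either recovers `q(x)`.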


3. If $A$ is $m \times n$, then $A^TA$ is the product of an $n \times m$ with an $m \times n$
matrix, which is defined, and is an $n \times n$ matrix. It therefore defines
a quadratic form on $R^n$, given by

$$q((x)) = X^T(A^TA)X$$

where $X$ is the one-column matrix corresponding to $(x)$. If we let
$Y = (y_1, \ldots, y_m)$ be the one-column matrix $AX$, then

$$q((x)) = Y^TY = y_1^2 + \cdots + y_m^2 \geq 0 \quad \text{for all } (x).$$

Thus $A^TA$ defines a non-negative semi-definite quadratic form.
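The identity $X^T(A^TA)X = Y^TY$ with $Y = AX$ is easy to try out on a small example. The sketch below uses plain Python lists; the function names and the sample matrices are chosen here purely for illustration.

```python
def quadratic_form_of_gram(A, x):
    """Evaluate X^T (A^T A) X by forming A^T A explicitly."""
    m, n = len(A), len(A[0])
    G = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)] for i in range(n)]
    return sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n))


def squared_length_of_image(A, x):
    """Evaluate Y^T Y, where Y = A X."""
    y = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return sum(yi * yi for yi in y)


A = [[1, 2, 0],
     [0, 1, -1]]          # a 2 x 3 matrix, so A^T A is 3 x 3
x = (3, -1, 2)
print(quadratic_form_of_gram(A, x), squared_length_of_image(A, x))

# A non-zero vector sent to zero by A gives the value 0, so the form
# need not be positive definite.
print(quadratic_form_of_gram([[1, 1]], (1, -1)))
```

The second call illustrates why the conclusion is "non-negative semi-definite" rather than "positive definite": the value is zero whenever $AX = 0$.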


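4.
$$\begin{aligned}
f_1(x_1, x_2) &= e^{x_1 + x_2} - e^{-x_1 - x_2} + 2e^{x_1 - x_2} - 2e^{-x_1 + x_2} \\
f_2(x_1, x_2) &= e^{x_1 + x_2} - e^{-x_1 - x_2} - 2e^{x_1 - x_2} + 2e^{-x_1 + x_2} \\
f_{11}(x_1, x_2) &= e^{x_1 + x_2} + e^{-x_1 - x_2} + 2e^{x_1 - x_2} + 2e^{-x_1 + x_2} \\
f_{12}(x_1, x_2) &= e^{x_1 + x_2} + e^{-x_1 - x_2} - 2e^{x_1 - x_2} - 2e^{-x_1 + x_2} \\
f_{22}(x_1, x_2) &= e^{x_1 + x_2} + e^{-x_1 - x_2} + 2e^{x_1 - x_2} + 2e^{-x_1 + x_2}
\end{aligned}$$

Thus:

$$\begin{aligned}
f_1(0, 0) &= 0 \\
f_2(0, 0) &= 0 \\
f_{11}(0, 0) &= 6 \\
f_{12}(0, 0) &= -2 \\
f_{22}(0, 0) &= 6
\end{aligned}$$

The quadratic Taylor approximation to $f$ about $(0, 0)$ is therefore

$$\begin{aligned}
f(x_1, x_2) &\simeq f(0, 0) + x_1 f_1(0, 0) + x_2 f_2(0, 0) + \tfrac{1}{2}x_1^2 f_{11}(0, 0) \\
&\qquad + x_1x_2 f_{12}(0, 0) + \tfrac{1}{2}x_2^2 f_{22}(0, 0) \\
&= 6 + 3x_1^2 - 2x_1x_2 + 3x_2^2 \\
&= 6 + \tfrac{1}{3}(3x_1 - x_2)^2 + \tfrac{8}{3}x_2^2
\end{aligned}$$

Thus the normal form for the matrix of the quadratic form is

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

and the stationary point is a local minimum.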




LINEAR MATHEMATICS

1 Vector Spaces
2 Linear Transformations
3 Hermite Normal Form
4 Differential Equations I
5 Determinants and Eigenvalues
6 NO TEXT
7 Introduction to Numerical Mathematics: Recurrence Relations
8 Numerical Solution of Simultaneous Algebraic Equations
9 Differential Equations II: Homogeneous Equations
10 Jordan Normal Form
11 Differential Equations III: Nonhomogeneous Equations
12 Linear Functionals and Duality
13 Systems of Differential Equations
14 Bilinear and Quadratic Forms
15 Affine Geometry and Convex Cones
16 Euclidean Spaces I: Inner Products
17 NO TEXT
18 Linear Programming
19 Least-squares Approximation
20 Euclidean Spaces II: Convergence and Bases
21 Numerical Solution of Differential Equations
22 Fourier Series
23 The Wave Equation
24 Orthogonal and Symmetric Transformations
25 Boundary-value Problems
26 NO TEXT
27 Chebyshev Approximation
28 Theory of Games
29 Laplace Transforms
30 Numerical Solution of Eigenvalue Problems
31 Fourier Transforms
32 The Heat Conduction Equation
33 Existence and Uniqueness Theorem for Differential Equations
34 NO TEXT


